Introduction



The mean field theory of disordered systems is a well established topic in statistical mechanics,
developed in the past thirty years with remarkable success (see [1] for the classical reference). Originally,
interest in this field was motivated by the experimental discovery of spin glasses, metallic compounds 
formed by diluting a ferromagnetic metal in a diamagnetic host, and which exhibit peculiar magnetic 
properties: on one hand, the low dilution ensures that the locations of the ferromagnetic atoms (and 
therefore their interactions) are random, so that their structure presents no order; on the other hand, 
evidence is found at low temperature for a transition to a phase in which the local magnetization 
is frozen (in a direction variable from point to point) and which therefore displays some properties 
characteristic of the presence of order. The phenomenology of spin glasses is indeed very rich, and 
the development of theoretical models able to explain and reproduce it in full has been a major chal- 
lenge (and achievement) of statistical mechanics in the past thirty years, requiring the introduction 
of innovative concepts and techniques. 

During the same time span, a large number of interesting results have been obtained in the
understanding of combinatorial optimization problems, and in the development of computational complexity
theory (see [2] for an introduction). Combinatorial optimization problems (that is to say, problems in
which the optimal configuration in a large and discrete set of candidates has to be found) are of the 
greatest interest for practical applications, and turn out to be general enough to deserve considerable 
attention from the theoretical point of view as well. Moreover, they are the cornerstone of complexity 
theory, the purpose of which is to characterize the intrinsic "hardness" of solving problems, and also 
the efficiency of algorithms used to solve them. 

Very early, it was recognized that these two fields, apparently far from each other, and studied 
by different communities of researchers, actually have very much in common. It was soon realized 
that random distributions of some well known and very important (both theoretically and in view 
of applications) combinatorial optimization problems were formally equivalent to diluted spin glass 
models, and could be treated with such powerful tools as the replica method and (somewhat later) 
the cavity approach. This has led, in the past decade, to a very fecund transfer of problems and ideas 
across the two fields, leading to significant advances in our understanding of both. 

A first area of interest is the characterization of the different phases of models that are relevant from 
both the optimization and statistical mechanics points of view. These models consist of a collection 
of N Ising spins that interact with k-body couplings (k = 2, 3, ...) with random strengths (the exact
form of the interactions defines each model). The number of interactions in which individual spins
participate, also called connectivity, plays the role of the control parameter, analogous to the pressure
in thermodynamic systems. As the connectivity is varied, the free energy "landscape" undergoes a
series of dramatic structural changes, which correspond to the onset of different "macroscopic" properties
of the system, such as the presence of an exponential (in N) number of local minima in the landscape,
or the value of the energy density of the global ground state being of order 1 rather than order N.

Another area of interest is the analysis of the algorithms that can be used in order to solve the
optimization problem represented by each model, or in equivalent but more physical terms, to find 
its ground state configurations. There is a huge variety in these algorithms: some of them define a 
dynamical process which modifies the configuration, in a manner similar to the well known Metropolis 
algorithm; some others perform a sequential assignment of the values of spins trying to minimize the 
number of positive contributions to the hamiltonian; others still do not act on the spins themselves, 
but rather on some effective variables, such as the magnetic field conjugated to each spin. Accordingly, 
a full "taxonomy" of algorithms can be constructed, and the average behaviour of whole classes of
algorithms, with similar structure but different attributes, can be characterized. This makes it
possible both to identify those algorithms that are of interest from the point of view of actual
applications, and to reach a better understanding of the properties of the models themselves.

Even though the list of the topics that have been studied in this field, and which are of interest 
for current research, includes many more, I shall limit the discussion to the previous ones: in this 
thesis, I have worked on problems that stem from these two lines of research. In the first Part, I 
shall therefore introduce the models I have studied and give an overview of the most relevant known 
results. Chapter 1 is an introduction to the physics of disordered systems: the main concepts of the 
statistical mechanics of spin glasses are introduced, with a discussion of the main phenomenological
features that characterize them, and of the replica method, which allows one to study them analytically.
In Chapter 2, I shall introduce combinatorial optimization problems, and some important results from
the theory of computational complexity that are relevant to my work; in particular, the two boolean
satisfiability problems that I have been interested in, called k-SAT and k-XORSAT, are defined, and
their properties are discussed. Finally, in Chapter 3 I shall review the results obtained by applying the 
methods and concepts developed for spin glasses to these two problems, and what their interpretation 
as spin glasses can teach us about the physics of these and similar systems. 

In the second Part of the thesis, I shall present some of the original results that I have obtained 
in collaboration with Remi Monasson, Giorgio Parisi and Francesco Zamponi. 

The first problem we have studied is motivated by a well known (but not as well understood) 
empirical observation: a large variety of systems present a phase in which the ground states form 
clusters and the spins are frozen; in this phase, no local search algorithm is capable of finding the
ground states in an efficient manner. In this context (and very loosely speaking), clusters are sets
of configurations which all have the ground state energy and which are connected, while different
clusters are well separated (two configurations are considered adjacent if they differ by a number of
spins which is of order 1 in N, connected if one can be reached from the other with a series of adjacent
steps, and separated if this is not possible); a frozen spin is a spin that takes the same value in all the
configurations of a cluster; a local search algorithm is an algorithm that only uses information about
the values of a number of variables which is of order 1 in N; and efficient means that the time (or
number of elementary computations) required to find a ground state configuration with this algorithm
grows no faster than a polynomial in N.

The simplest model presenting such a "clustered-frozen" phase is k-XORSAT, which I mentioned
before and shall discuss in Chapter 2; on the other hand, one of the most studied (and useful
in practical applications) local algorithms is DPLL, which works by assigning variables in sequence 
according to some simple rule called heuristics, and which I shall also discuss in Chapter 2. In order to 
gain a better understanding of the failure of local search algorithms in the clustered-frozen phase, we 
have studied a fairly general class of DPLL heuristics for k-XORSAT, obtaining some results that I shall
present in Chapter 4. Most notably, we have obtained the first proof (to the best of my knowledge)
that any heuristic in this class fails to find a ground state in polynomial time with probability
1 as N goes to infinity. Moreover, we have obtained an argument that supports the claim that in the
large k limit, one of the heuristics belonging to the class we have studied (and which was previously 
introduced and called GUC) is capable of finding ground states efficiently with probability 1 up to 
the onset of the clustered-frozen phase, while all the other heuristics previously studied were known 
to fail well before this phase transition. 

The second problem we have considered concerns the most studied and celebrated combinatorial 
optimization problem: k-SAT. There are many reasons motivating the interest in k-SAT, notably
that it is the first problem for which NP-completeness (which is the key concept in computational 
complexity theory) was proven, and that it is so general that a huge number of other problems (many 
of which are relevant in view of applications) can be expressed as particular instances of k-SAT. As 
a result of these extended studies, a very rich phase structure has emerged, with a multitude of 
transitions determined by temperature and connectivity. The aim of our work was to study the phase 
which is obtained at zero temperature when the connectivity goes to infinity. Apart from the intrinsic 
interest of studying one of the phases of the system, this problem is very interesting due to some recent 
results in computational complexity theory that establish a link between the average case complexity 
of k-SAT at large connectivity and the worst case complexity of several other problems. No relation 
between these two measures of complexity was previously known, and the complexity class of the 
problems considered depends on the properties of k-SAT at large connectivity.

The main result we have obtained is that this phase of the system is characterized by the presence 
of a single cluster of ground states in which the fraction of spins that are not frozen goes exponentially
to 0 as the connectivity is increased, and that the field conjugated to frozen spins is of the same order
as the connectivity. I shall present these results in Chapter 5, together with a discussion of their
interest and consequences for computational complexity theory. 

Moreover, during the past year I have engaged in the study of yet another algorithm for boolean 
satisfiability problems, going under the name of WalkSAT. This work, which consists in a numerical 
characterization of the average behavior of the algorithm, and in elucidating the properties of k-SAT
that this behavior implies, is still in progress, and will constitute the object of a future publication.

Chapter 1

Statistical mechanics of disordered systems

1.1 Statistical mechanics and phase transitions 

In this section I shall introduce some notation and briefly review some fundamental concepts of 
statistical mechanics, illustrating them with the example of the Ising ferromagnet. 

1.1.1 The Gibbs distribution 

A general system studied in statistical mechanics will have a large number N of degrees of freedom
$\{x_i \in X \mid i = 1, \dots, N\}$. A configuration $\mathcal{C} \in X^N$ of the system is determined by specifying the value
taken by each $x_i$. The hamiltonian $H(\mathcal{C})$ of the system will be an extensive function of the configuration.

The statistical properties of the system are determined by the probability distribution of the
configurations. If a system can exchange energy with its surroundings at temperature¹ $T = 1/\beta$, this
probability is given by the Gibbs distribution:

$$P[\mathcal{C}] = \frac{1}{Z(\beta)}\, e^{-\beta H(\mathcal{C})} \tag{1.1}$$

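where the partition function $Z(\beta)$ is a normalization. In fact, it is much more than a normalization,
since all the equilibrium properties of the system can be computed from it. For example, the average
moments of the energy are given by its derivatives:

$$E(\beta) \equiv \mathbb{E}[H(\mathcal{C})] = -\frac{\partial}{\partial\beta} \log Z(\beta), \tag{1.2}$$

$$\mathbb{E}[H(\mathcal{C})^2] - \mathbb{E}[H(\mathcal{C})]^2 = \frac{\partial^2}{\partial\beta^2} \log Z(\beta), \quad \dots \tag{1.3}$$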
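
As a quick numerical illustration of (1.2) and (1.3), the sketch below compares the Gibbs mean and
variance of the energy with finite-difference derivatives of log Z. The toy system (a few independent
Ising spins in a unit field), the function names and the step size are assumptions made for this
example only.

```python
import itertools
import math


def energies(n_spins):
    """Energies of all configurations of a hypothetical toy system:
    n_spins independent Ising spins in a unit field, H = -sum_i s_i."""
    return [-sum(s) for s in itertools.product((-1, 1), repeat=n_spins)]


def log_z(beta, levels):
    """log of the partition function Z(beta) = sum_C exp(-beta * H(C))."""
    return math.log(sum(math.exp(-beta * e) for e in levels))


def gibbs_moments(beta, levels):
    """Mean and variance of the energy under the Gibbs distribution."""
    z = sum(math.exp(-beta * e) for e in levels)
    mean = sum(e * math.exp(-beta * e) for e in levels) / z
    second = sum(e * e * math.exp(-beta * e) for e in levels) / z
    return mean, second - mean ** 2


def derivatives(beta, levels, step=1e-4):
    """Central finite differences for the first and second beta-derivatives of log Z."""
    up, mid, down = (log_z(b, levels) for b in (beta + step, beta, beta - step))
    return (up - down) / (2 * step), (up - 2 * mid + down) / step ** 2


if __name__ == "__main__":
    levels = energies(6)
    print(gibbs_moments(0.7, levels))
    print(derivatives(0.7, levels))
```

The mean energy should match minus the first derivative, and the variance should match the second
derivative, up to the finite-difference error.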

The entropy and the free energy can be introduced in two equivalent ways. The "microcanonical" 
entropy is the logarithm of the number of configurations with energy E: 

$$S_m(E) = \log \left| \{ \mathcal{C} \in X^N : H(\mathcal{C}) = E \} \right|. \tag{1.4}$$

We can expect it to be an extensive quantity and define the entropy density $s_m(e) = S_m(Ne)/N$. Since
the Gibbs measure depends on the configurations only through the energy, we can greatly simplify the
description of the system by considering the probability to find it in any configuration of energy $E$:

$$P[E] = \sum_{\{\mathcal{C} \in X^N \mid H(\mathcal{C}) = E\}} P[\mathcal{C}] = \frac{1}{Z(\beta)}\, e^{-\beta E + S_m(E)} = \frac{1}{Z(\beta)}\, e^{-\beta F_m(E)}, \tag{1.5}$$

where we have introduced the free energy $F_m(E) = N f_m(E/N) = E - S_m(E)/\beta$.

¹ I shall always use "natural" units, in which the Boltzmann constant is equal to 1.

On the other hand, the "canonical" entropy is defined in terms of the Gibbs distribution as 

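$$S_c(\beta) = -\mathbb{E}[\log P[\mathcal{C}]] = -\sum_{\mathcal{C}} P[\mathcal{C}] \log P[\mathcal{C}], \tag{1.6}$$

(notice that $P[\mathcal{C}]$ depends on $\beta$) while the free energy is defined as

$$F_c(\beta) = -\frac{1}{\beta} \log Z(\beta). \tag{1.7}$$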



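Notice that these definitions imply that

$$S_c(\beta) = -\mathbb{E}\left[ \log \frac{e^{-\beta H(\mathcal{C})}}{Z(\beta)} \right] = \beta E(\beta) - \beta F_c(\beta), \tag{1.8}$$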



which formally corresponds to the similar microcanonical relation. 

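The relationship between the microcanonical and canonical approaches becomes evident in the
thermodynamic limit $N \to \infty$. In this limit, we can compute the canonical free energy with the
Laplace method:

$$f_c(\beta) = -\lim_{N\to\infty} \frac{1}{N\beta} \log Z(\beta) \tag{1.9}$$

$$= -\lim_{N\to\infty} \frac{1}{N\beta} \log \int de\; e^{-N\beta f_m(e)} \tag{1.10}$$

$$= -\lim_{N\to\infty} \frac{1}{N\beta} \log e^{-N\beta f_m(\bar e)} \tag{1.11}$$

$$= f_m(\bar e) \tag{1.12}$$

where $\bar e$ is the value that maximizes the exponent, i.e. $f_m'(\bar e) = 0 \Leftrightarrow s_m'(\bar e) = \beta$. In the
thermodynamic limit the energy is concentrated, so that

$$e(\beta) = \lim_{N\to\infty} \int de\; e\, \frac{e^{-N\beta f_m(e)}}{Z(\beta)} = \bar e \lim_{N\to\infty} e^{N[-\beta \bar e + s_m(\bar e) + \beta f_c(\beta)]} = \bar e, \tag{1.13}$$

where the last equality follows from (1.12) and from the definition of $f_m$. Therefore
$f_c(\beta) = f_m(e(\beta))$ and $s_c(\beta) = s_m(e(\beta))$.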

The physical interpretation of the free energy becomes clear by observing that (1.5) can be rewritten
as

$$P[e] = \frac{1}{Z(\beta)}\, e^{-N\beta f_m(e)} \simeq e^{-N\beta [f_m(e) - f_m(e(\beta))]}, \tag{1.14}$$

i.e. the probability that e takes a value which is different from the expected value is exponentially 
small in N and the corresponding large deviations function is the free energy itself. 

Also notice that if the energy of a configuration $\mathcal{C}$ only depends on some extensive observable $O$,
i.e. $H(\mathcal{C}) = \mathcal{E}(O(\mathcal{C}))$ where $\mathcal{E}$ is some function, then the expected value and the distribution of the
large deviations of $O$ can be expressed in a similar way in terms of the free energy, by writing it as a
function of $o = O/N$.

1.1.2 Phase transitions and ergodicity breaking

Let us now discuss a specific example: the infinite range Ising ferromagnet. The degrees of freedom
are Ising spins $\sigma_i \in \{-1, 1\}$. We consider that each spin interacts with all the others and with a
homogeneous external field $h^{\rm ext}$:

$$H(\sigma) = -\sum_{i<j}^{1,N} J_N\, \sigma_i \sigma_j - h^{\rm ext} \sum_i \sigma_i. \tag{1.15}$$

In order for the energy to be extensive, $J_N$ must scale with the number of spins as $J/N$, and we set
the energy units so that the factor $J$ is 1.

It is easy to solve this model with the trick discussed in the last paragraph of the previous section:
the energy (1.15) depends on the configuration only through the total magnetization $M(\sigma) = \sum_i \sigma_i$,
which is an extensive quantity. In terms of densities,

$$e(m) = -\frac{1}{2} m^2 - h^{\rm ext} m. \tag{1.16}$$

The number of configurations with magnetization $M$ is just $\binom{N}{N_+}$ where $N_+ = (N + M)/2$ is the
number of up spins, so that the (microcanonical) entropy is obtained by Stirling's approximation:

$$s(m) = -\frac{1+m}{2} \log \frac{1+m}{2} - \frac{1-m}{2} \log \frac{1-m}{2}. \tag{1.17}$$
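
The quality of the Stirling approximation behind (1.17) can be checked directly. The sketch below
compares the exact logarithm of the binomial coefficient, divided by N, with the entropy density
s(m); the function names and the use of the log-gamma function are choices made for this example.

```python
import math


def entropy_density(m):
    """s(m) as in Eq. (1.17), with the convention 0 * log 0 = 0."""
    total = 0.0
    for p in ((1 + m) / 2, (1 - m) / 2):
        if p > 0:
            total -= p * math.log(p)
    return total


def exact_entropy_density(n_spins, n_up):
    """log of the binomial coefficient (n_spins choose n_up), divided by n_spins."""
    log_binom = (math.lgamma(n_spins + 1) - math.lgamma(n_up + 1)
                 - math.lgamma(n_spins - n_up + 1))
    return log_binom / n_spins


if __name__ == "__main__":
    for n_spins in (10 ** 2, 10 ** 4, 10 ** 6):
        n_up = (6 * n_spins) // 10  # magnetization density m = 0.2
        print(n_spins, exact_entropy_density(n_spins, n_up), entropy_density(0.2))
```

The difference between the two columns shrinks as log(N)/N, which is the size of the corrections
neglected in the Stirling formula.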

The equilibrium magnetization $\bar m$ is obtained by introducing $f(m) = e(m) - s(m)/\beta$, from the condition

$$f'(m) = 0 \;\Leftrightarrow\; -m - h^{\rm ext} + \frac{1}{2\beta} [\log(1 + m) - \log(1 - m)] = 0, \tag{1.18}$$

from which the self-consistent equation

$$\bar m = \tanh[\beta(\bar m + h^{\rm ext})] \tag{1.19}$$

is found.
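
The self-consistent equation (1.19) is easily solved by fixed-point iteration. The following is a
minimal sketch; the function name, the starting point, the tolerance and the iteration cap are
assumptions of this example, and plain iteration is only one of several possible ways to solve it.

```python
import math


def solve_magnetization(beta, h_ext=0.0, m_start=0.5, tol=1e-12, max_iter=100000):
    """Iterate m -> tanh(beta * (m + h_ext)) until successive values agree within tol."""
    m = m_start
    for _ in range(max_iter):
        m_new = math.tanh(beta * (m + h_ext))
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


if __name__ == "__main__":
    for beta in (0.5, 0.9, 1.1, 2.0):
        print(beta, solve_magnetization(beta))
```

In zero field the iteration flows to zero at small beta and to a nonzero value at large beta;
starting from a negative `m_start` gives the opposite sign.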

We see that for $\beta > 1$ this equation admits a solution with $\bar m \neq 0$ even if $h^{\rm ext} = 0$, i.e. there is a
spontaneous magnetization, while for $\beta < 1$ this is not the case. This is one of the simplest examples
of phase transition, in which the magnetization has the role of the order parameter characterizing the
phases. Notice that the existence of a spontaneous magnetization is a very striking phenomenon: in
the absence of an external field, the energy is an even function of the magnetization, and the Gibbs
weight of the configurations with magnetization $m$ is the same as that corresponding to magnetization
$-m$, so that the expected value of the magnetization is 0 at all temperatures.

The solution of this apparent contradiction can be understood by a more careful consideration of the
free energy of the problem. In the absence of field, $f(m)$ is an even function of $m$. It can be easily seen
that the sign of $f''(0)$ is the same as that of $1 - \beta$: at high temperature $m = 0$ is the absolute minimum
of $f$, while at low temperature $f$ has two equal minima $f(m_+) = f(m_-)$. In this line of reasoning,
we are implicitly assuming that the external field is exactly 0 when we take the thermodynamic limit.
However, this is not a satisfactory assumption: the magnetic field is a physical parameter, while the 
thermodynamic limit is an idealization, so that the description of the physical ferromagnet should be 
obtained by considering a finite size system in the presence of a (possibly small) magnetic field, and 
computing the thermodynamic limit of the system in the presence of the field, which can then be 
taken to 0. The expected magnetization in the absence of field is then 

$$m_0 = \lim_{h^{\rm ext} \to 0} \lim_{N \to \infty} \frac{1}{N}\, \mathbb{E}[M \mid \beta, h^{\rm ext}]. \tag{1.20}$$

As a consequence, the degeneracy between the two minima of $f$ present when $\beta > 1$ is removed
before we take the limit of zero field, and only one of the two minima will contribute to the Gibbs
measure. Loosely speaking, in the presence of spontaneous magnetization $m_0 > 0$, in order to reach
a configuration of magnetization $m < 0$ the system must cross a free energy barrier of order $O(N)$,
which cannot occur in the thermodynamic limit: the configuration space then breaks in two distinct 
regions, one containing all the configurations with positive magnetization and the other those with 
a negative one, and the two regions are dynamically disconnected. This is an example of ergodicity 
breaking (for a clarifying discussion of ergodicity breaking in magnetic systems, see Chapter 2 of [3]). 
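
The role of the order of the two limits can be seen numerically. The sketch below computes the exact
expected magnetization density of the infinite range ferromagnet at finite N by summing over the
number of up spins; the function name and the parameter values in the usage lines are assumptions of
this example.

```python
import math


def mean_magnetization(n_spins, beta, h_ext):
    """Exact E[M]/N for the infinite range Ising ferromagnet with J = 1."""
    log_weights = []
    magnetizations = []
    for n_up in range(n_spins + 1):
        m = (2 * n_up - n_spins) / n_spins
        log_binom = (math.lgamma(n_spins + 1) - math.lgamma(n_up + 1)
                     - math.lgamma(n_spins - n_up + 1))
        # -beta * H = beta * N * (m^2 / 2 + h_ext * m), up to an m-independent constant
        log_weights.append(log_binom + beta * n_spins * (0.5 * m * m + h_ext * m))
        magnetizations.append(m)
    shift = max(log_weights)  # subtract the maximum to avoid overflow
    weights = [math.exp(lw - shift) for lw in log_weights]
    return sum(m * w for m, w in zip(magnetizations, weights)) / sum(weights)


if __name__ == "__main__":
    for n_spins in (20, 200, 2000):
        print(n_spins, mean_magnetization(n_spins, beta=2.0, h_ext=0.01))
    print(mean_magnetization(2000, beta=2.0, h_ext=0.0))
```

At fixed small field the result approaches the spontaneous magnetization as N grows, whereas at
exactly zero field it vanishes by symmetry for every N.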

A final remark concerning the nature of the phase transition. We can compute the magnetization 
as a function of the external field by looking at the positions of the minima of /. In the absence 
of field, when $\beta \simeq 1 + \epsilon$ the two minima are separated by a distance of order $o(1)$ (as $\epsilon \to 0$), and
the value of the spontaneous magnetization grows continuously from 0 to a finite value with $\beta - 1$.
However, a different situation can occur, in which at the critical temperature the free energy has two
well separated minima, such that one is favored for $\beta = \beta_c + \epsilon$ and the other for $\beta = \beta_c - \epsilon$. In this
case, when the temperature crosses the critical value, the order parameter undergoes a discontinuous
change. Discontinuous phase transitions of this kind are said to be of first order, while continuous
ones are said to be of second order.

1.2 Disordered systems and spin glasses 

Disorder is ubiquitous in nature: amorphous materials are infinitely more common than crystals; 
biological systems sometimes manifest order in the form of regular behavior, but rarely of structure; 
the distribution of matter in the universe is irregular at any scale... Countless more examples show 
that, in fact, disorder is the rule of nature, and order is the exception. 

However, the apparent lack of order and structure is not a sufficient criterion to consider a system 
as properly disordered. After all, a snapshot of the positions of molecules in a gas shows no sign of 
order, and yet gases have a perfectly regular behavior under most conditions. On the other hand, a
system as simple as a double pendulum can have an incredibly complicated dynamical evolution, with 
no signs of regularity at all, but would hardly be considered disordered. 

In this section I shall try to give some examples of systems in which disorder plays a crucial role 
in determining their behavior, and which can be understood in terms of some very general concepts, 
in order to obtain a better characterization of what "proper" disordered systems are. I shall also 
introduce a formalism that has proven extremely powerful to describe them in a quantitative way. 

1.2.1 Origins of disorder 

In general, a disordered system can be characterized as having two distinct sets of parameters. The 
first one corresponds to the degrees of freedom of the system that have a dynamical evolution during 
the observation of the system. The second set corresponds to some parameters that influence the 
dynamics of the degrees of freedom, but that do not change during the observation, and which have 
"random" or irregular values. 

In some cases the distinction between the two sets of variables will be purely dynamical. Glasses 
are a prototypical example of this kind of systems. They lack any long-range order, but locally the 
positions of atoms are very constrained. As a result, the motion of an atom typically requires the 
rearrangement of a number of neighbors that varies widely, and some degrees of freedom are effectively 
"frozen" over the experimental time scales, while others undergo a fast dynamical evolution. Another 
example of this class of systems is provided by kinetically constrained models, which are a simplification
and generalization of glasses. These models generally study particles on lattices that undergo some
simple dynamics, e.g. each site can be either empty or occupied by one particle, and particles can hop
from one site to the next under some conditions that are specific to the model and which typically
include that the destination site be empty. Depending on the boundary conditions and on the specific dynamical
rules a rich phenomenology can be produced. 

In other cases the distinction between dynamical variables and "frozen" parameters is explicit: 
some parameters (e.g. the interaction strength between pairs of particles) take constant random 
values, extracted from some known distribution. This kind of disorder is said to be quenched.² The
most celebrated example is that of magnetic impurities diluted in noble metal alloys, in which the 
positions of the impurities, and therefore the strengths of their magnetic interactions, are in fact 
random, giving rise to a very peculiar phenomenology. The theoretical models introduced to study 
these materials and to reproduce their behavior go under the name of spin glasses. The rest of 
this section will be devoted to introducing the most widely studied models of spin glasses, while their
phenomenology and the analytical techniques used to solve them will be discussed in the later sections
of this Chapter.

1.2.2 Spin glass models 

The simplest model for spin glasses has the following hamiltonian (for the classical introduction to
the field, see [1]):

$$H_J = \sum_{i,j} J_{ij}\, \sigma_i \sigma_j \tag{1.21}$$

where the $J = \{J_{ij}\}$ are random couplings and $\sigma = \{\sigma_i\}$ are Ising spins. Depending on the geometry
of the interaction, several models can be obtained:

Edwards-Anderson (EA): The interactions involve only nearest neighbors on a lattice of
dimension $D$, and their strengths are random variables extracted from a Gaussian distribution
with zero average and finite variance. This was the first model introduced to describe magnetic
alloys.

Sherrington-Kirkpatrick (SK): Each $J_{ij}$ (for each distinct pair of indices) is extracted from
a Gaussian distribution. In order for the energy to be extensive, the standard deviation of the
distribution must be of order $O(N^{-1/2})$ [5].

Bethe lattice: The interactions between spins are described by a Bethe lattice (i.e. a random
graph with a finite connectivity $k$ and with no loops), and their strength has a standard deviation
proportional to $k^{-1/2}$.
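
The reason for scaling the couplings of a fully connected model with the inverse square root of N
can be illustrated numerically: with this choice the local field acting on each spin stays of order 1
as N grows, so that the energy is extensive. The sketch below is only an illustration; the function
names, the random spin configuration and the seed are assumptions of this example.

```python
import random


def sk_couplings(n_spins, rng):
    """Symmetric Gaussian couplings with zero mean and standard deviation N^(-1/2)."""
    scale = n_spins ** -0.5
    j = [[0.0] * n_spins for _ in range(n_spins)]
    for a in range(n_spins):
        for b in range(a + 1, n_spins):
            j[a][b] = j[b][a] = rng.gauss(0.0, scale)
    return j


def local_field_variance(n_spins, seed=0):
    """Empirical variance over sites of h_i = sum_j J_ij s_j for random spins s."""
    rng = random.Random(seed)
    j = sk_couplings(n_spins, rng)
    spins = [rng.choice((-1, 1)) for _ in range(n_spins)]
    fields = [sum(j[a][b] * spins[b] for b in range(n_spins)) for a in range(n_spins)]
    mean = sum(fields) / n_spins
    return sum((h - mean) ** 2 for h in fields) / n_spins


if __name__ == "__main__":
    for n_spins in (100, 200, 400):
        print(n_spins, local_field_variance(n_spins))
```

The printed variances should fluctuate around 1 independently of N; without the rescaling they would
grow linearly with N.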

A simple generalization is obtained by allowing the interaction to involve a number of spins $p > 2$:

$$H_J = \sum_{i_1, \dots, i_p} J_{i_1 i_2 \dots i_p}\, \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_p}. \tag{1.22}$$

In such $p$-spin models the spins can be either Ising or real ($\sigma_i \in \mathbb{R}$). In the latter case a spherical
constraint $\frac{1}{N} \sum_i \sigma_i^2 = 1$ is imposed. Many more models have been proposed and studied, which I shall
not describe.

² Notice, however, that there is no fundamental difference between quenched and dynamically induced disorder: in
both cases, a large number of parameters is effectively frozen in random values. The difference is mainly related to the
description, rather than the physics of the system.

1.2.3 Mean field theory and diluted models

Even though the Edwards-Anderson model was the first spin glass model to be proposed, in 1975, it
still awaits a general solution. In fact, most of the progress made in spin glasses has been obtained on
the basis of mean field theory. Mean field theory can be defined as in the case of the Ising ferromagnet
by writing the hamiltonian (1.21) in terms of local fields,

$$H_J = \sum_i h_i \sigma_i, \qquad h_i = \sum_j J_{ij} \sigma_j, \tag{1.23}$$

and replacing the configuration-dependent value of $h_i$ with its thermal average, which depends on
magnetizations rather than spin values. This approach can be generalized (and made much more
powerful) by writing directly an expression for the free energy which depends on the local
magnetizations $\{m_i\}$ and looking for the values of $\{m_i\}$ that satisfy the set of equations $\partial f / \partial m_i = 0$, an
approach that goes under the names of Thouless, Anderson and Palmer (TAP) [6]. However, much
care should be exercised in deriving the expression for the free energy, and it should be kept in mind
that since this doesn't (usually) come from a variational principle, there is no requirement for the
solutions to the TAP equations to be minima of the free energy. As we shall see, the mean field results
can be derived in a more transparent, but more complicated, analytical way. 

A very important point to stress is that mean field results are in general exact for infinite range 
models, such as SK (and this has been recently rigorously proved), but are only approximations for 
large (but finite) range models, which become poor approximations if the range of interaction is short. 
This is due to the fact that in long range models, local fluctuations of thermodynamic quantities have 
no global effects, while in short range models they become crucial. However, finite range models have 
proven themselves very elusive so far. This raises the question of how to include local fluctuation 
effects in more tractable models. 

A step in this direction is provided by diluted models, of which the Bethe lattice model
introduced in the previous subsection is an example. A more general case is obtained when the
geometry of the model is an Erdős-Rényi random graph, in which each pair of spins has the same
probability of being connected, and the average connectivity is finite. In these models, the corrections
to mean field theory arise from loops, which are typically of length $O(\log N)$, and their magnitude is
small and can be dealt with (as we shall see when I introduce the cavity method). On the other
hand, local fluctuations are present in diluted models, and they can be studied in this context.
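
The geometry of such a diluted model can be generated in a few lines. The sketch below draws each
pair of sites independently with a probability chosen so that the average connectivity stays finite
as N grows; the function names, the seed and the specific choice of probability c/(N - 1) are
assumptions of this example.

```python
import random


def erdos_renyi_edges(n_spins, mean_connectivity, seed=0):
    """Connect each pair (a, b), a < b, independently with probability c / (N - 1)."""
    rng = random.Random(seed)
    prob = mean_connectivity / (n_spins - 1)
    return [(a, b) for a in range(n_spins) for b in range(a + 1, n_spins)
            if rng.random() < prob]


def degrees(n_spins, edges):
    """Number of interactions in which each site participates."""
    counts = [0] * n_spins
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


if __name__ == "__main__":
    n_spins = 600
    deg = degrees(n_spins, erdos_renyi_edges(n_spins, 3.0))
    print(sum(deg) / n_spins, min(deg), max(deg))
```

The mean of the degrees stays close to the chosen connectivity while individual sites fluctuate
around it, which is the kind of local fluctuation absent from fully connected models.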

1.2.4 Frustration, local degeneracies, complexity 

A very general and important feature of the spin glass hamiltonian (1.21) is that its global minima,
which govern the low temperature behavior of the system, cannot be found by local optimization. 
This fact has two causes, and very deep implications. 

The first cause is frustration, which can be most simply illustrated by an example: if $J_{12}, J_{13} < 0$
while $J_{23} > 0$ there is no possible assignment of $\sigma_1, \sigma_2, \sigma_3$ that will make all three terms in
$J_{12}\sigma_1\sigma_2 + J_{13}\sigma_1\sigma_3 + J_{23}\sigma_2\sigma_3$ negative (the product of the three terms is $J_{12} J_{13} J_{23} > 0$, so that
the number of negative terms is even). Some of the addends in the hamiltonian will have to be positive, and the
minimization of the hamiltonian requires a global approach.
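
Frustration on a triangle can be checked by brute force over the eight spin assignments. The sketch
below takes the three terms exactly as written in the sum above, each $J_{ij}\sigma_i\sigma_j$ entering with a plus
sign (an assumption about the sign convention), and tests whether all of them can be made negative at
once. The function names are hypothetical.

```python
import itertools


def triangle_terms(j12, j13, j23, spins):
    """The three interaction terms J_ij * s_i * s_j of a triangle of Ising spins."""
    s1, s2, s3 = spins
    return (j12 * s1 * s2, j13 * s1 * s3, j23 * s2 * s3)


def is_frustrated(j12, j13, j23):
    """True if no spin assignment makes all three terms negative simultaneously."""
    return not any(
        all(term < 0 for term in triangle_terms(j12, j13, j23, spins))
        for spins in itertools.product((-1, 1), repeat=3)
    )


if __name__ == "__main__":
    for couplings in itertools.product((-1, 1), repeat=3):
        print(couplings, is_frustrated(*couplings))
```

With this convention the enumeration reports frustration exactly when the product of the three
couplings is positive.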

Also, once it is clear that some interactions will have to give positive contributions, it is also clear
that a large number of choices are possible for which terms to make positive: in general a large number
of configurations will have the ground state energy density. But this local degeneracy, which is the
second obstacle to local optimization, can occur independently of frustration. If we consider (only for
the sake of this argument) an Ising $p$-spin model with large $p$ and all the $J$'s positive, we see that
the number of assignments that minimize each term in the hamiltonian (separately) is $2^{p-1}$. Each
many-spin interaction term poses a very weak constraint on the individual spins.
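
The counting of the assignments that minimize a single $p$-spin term can be verified by enumeration.
This is a minimal sketch with hypothetical names; it considers one isolated term with a coupling of
either sign.

```python
import itertools
import math


def count_minimizers(p, coupling=1.0):
    """Number of assignments of p Ising spins minimizing the single term J * s_1 * ... * s_p."""
    terms = [coupling * math.prod(s) for s in itertools.product((-1, 1), repeat=p)]
    lowest = min(terms)
    return sum(1 for t in terms if t == lowest)


if __name__ == "__main__":
    for p in range(2, 9):
        print(p, count_minimizers(p), 2 ** (p - 1))
```

Half of all the assignments of the $p$ spins minimize the term, whatever the sign of the coupling, so
the constraint on any individual spin becomes weaker as $p$ grows.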

The consequence of frustration and local degeneracy is that in general the ground state of a
spin glass will be highly degenerate. Not only will the number of minimal energy configurations be
exponential in the size of the system, but often, due to disorder, the Gibbs measure will decompose
into a large number $\mathcal{N}$ of pure states. In some cases this number will be exponential: $\mathcal{N} \sim e^{N\Sigma}$, where
$\Sigma > 0$ is called complexity; in other cases $\mathcal{N}$ will be sub-exponential in $N$, but still large.

1.2.5 The order parameter of disordered systems 

The most striking feature of spin glasses is that there is order hidden in their disorder. If one looks at 
a "typical" configuration of a spin glass, it will look the same at any temperature: each spin points 
in an apparently random direction. However, as the temperature is lowered, each spin becomes more 
and more "frozen" in a particular direction, which will depend on the site and which will "look" 
as disordered as the typical high temperature configuration. At sufficiently low temperatures, even 
though the site-averaged magnetization is zero, the local average magnetization is not. A convenient 
measure of this hidden order was introduced by Edwards and Anderson [1], and goes under their 
names:

$$q_{\rm EA} = \frac{1}{N} \sum_i m_i^2, \tag{1.24}$$

where $m_i$ is the thermal average of $\sigma_i$. In the following I shall denote thermal averages with angled
brackets, e.g. $m_i = \langle \sigma_i \rangle$.

Of course, since the hamiltonian is dependent on the specific values of the random couplings, the
value of $m_i$ will also depend on them. However, for many physical observables the average over sites
is equal to the average over disorder:

$$\frac{1}{N} \sum_i O_i(J) = \int d\mu(J)\, O_i(J), \tag{1.25}$$

where $\mu(J)$ is the distribution of disorder. Such observables are said to be self-averaging, and the
Edwards-Anderson order parameter $q_{\rm EA}$ is one of them. On the other hand, if physically relevant
observables were to be dependent on the realization of disorder, i.e. on the specific sample, there
would be very little to say about them, and very little interest in their study.

The Edwards-Anderson order parameter is very closely related to a more general quantity, the
overlap, which can be defined in two different contexts. The overlap between microscopic configurations
$\sigma$ and $\tau$ can be defined as

$$q_{\sigma\tau} = \frac{1}{N} \sum_i \sigma_i \tau_i, \tag{1.26}$$

which will be in the interval $[-1, 1]$. The value 1 will correspond to perfectly correlated configurations,
$-1$ to perfectly anti-correlated ones, and 0 to uncorrelated $\sigma$ and $\tau$. The concept of overlap can
be extended to thermodynamic states, and is particularly interesting in the presence of ergodicity
breaking. If we consider two different thermodynamic states $\alpha$ and $\beta$, we can compute

$$q_{\alpha\beta} = \frac{1}{N} \sum_i \langle \sigma_i \rangle_\alpha \langle \sigma_i \rangle_\beta, \tag{1.27}$$

which will measure how different the two states are.
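
The overlap is straightforward to compute. In the sketch below the same function serves for two
microscopic configurations (entries equal to plus or minus 1) and for two states described by their
local magnetizations (real entries); the function name and the example vectors are assumptions of
this illustration.

```python
def overlap(sigma, tau):
    """Site-averaged product of two equally long sequences of spins or local magnetizations."""
    if len(sigma) != len(tau):
        raise ValueError("the two sequences must have the same length")
    return sum(s * t for s, t in zip(sigma, tau)) / len(sigma)


if __name__ == "__main__":
    sigma = [1, -1, 1, 1]
    print(overlap(sigma, sigma))                     # identical configurations
    print(overlap(sigma, [-s for s in sigma]))       # globally flipped configuration
    print(overlap([1, 1, -1, -1], [1, -1, 1, -1]))   # agreement on half of the sites
```

Identical configurations give 1, a global flip gives -1, and configurations agreeing on exactly half
of the sites give 0.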
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When a single state is present, the Edwards- Anderson order parameter is just ^ea = laa, the 
self-overlap of the state with itself. However, in presence of crgodicity breaking, the Gibbs measure 
decomposes in a sum over pure states, 

(0> = E I e-''^(-) = ^ ^ ^ ± 0{a) e'^"^^^ = ^ (O). (1.28) 

where Za = So-ea ^-'^P(^^^(''^)) ^^'^ ""^^ = Za/Z is the relative weight of the state a in the decom- 
position. In this case, the Edwards- Anderson parameter is given by 



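$$ q_{\rm EA} = \frac{1}{N} \sum_i \langle \sigma_i \rangle^2 = \frac{1}{N} \sum_i \Big( \sum_\alpha w_\alpha \langle \sigma_i \rangle_\alpha \Big)^2 = \sum_{\alpha,\beta} w_\alpha w_\beta \, q_{\alpha\beta} \tag{1.29} $$

in which not just the self-overlaps of the states are considered, but also the overlaps among different
states.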

A very powerful characterization of the structure of the thermodynamic states is provided by the 
distribution of overlaps between states,

$$ P(q) = \sum_{\alpha,\beta} w_\alpha w_\beta \, \delta(q - q_{\alpha\beta}) \tag{1.30} $$

which gives the probability that two configurations picked at random from the Gibbs distribution have
overlap $q$. In terms of $P(q)$ we have

$$ q_{\rm EA} = \int dq \, P(q) \, q . \tag{1.31} $$
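The relation between the state weights, the overlaps between states and the Edwards-Anderson parameter can be checked numerically on a toy example. The following is only a sketch: the two "states", their weights and their local magnetizations are hypothetical values chosen for illustration, and the function names are not taken from the text.

```python
from collections import defaultdict

def overlap(sigma, tau):
    """Overlap (1/N) sum_i sigma_i * tau_i between two configurations
    (or between two vectors of local magnetizations)."""
    if len(sigma) != len(tau):
        raise ValueError("configurations must have the same length")
    return sum(s * t for s, t in zip(sigma, tau)) / len(sigma)

def state_overlaps(magnetizations):
    """Matrix q[a][b] of overlaps between states, from their local magnetizations."""
    return [[overlap(ma, mb) for mb in magnetizations] for ma in magnetizations]

def overlap_distribution(weights, q):
    """Discrete version of P(q): weight w_a * w_b attached to each value q[a][b]."""
    dist = defaultdict(float)
    for a, wa in enumerate(weights):
        for b, wb in enumerate(weights):
            dist[round(q[a][b], 12)] += wa * wb
    return dict(dist)

def q_ea_from_distribution(dist):
    """First moment of the discrete overlap distribution."""
    return sum(value * prob for value, prob in dist.items())

def q_ea_direct(weights, magnetizations):
    """(1/N) sum_i <sigma_i>^2 with <sigma_i> = sum_a w_a <sigma_i>_a."""
    n_sites = len(magnetizations[0])
    total = 0.0
    for i in range(n_sites):
        m_i = sum(w * m[i] for w, m in zip(weights, magnetizations))
        total += m_i * m_i
    return total / n_sites

# Hypothetical example: two states with opposite local magnetizations.
weights = [0.7, 0.3]
magnetizations = [[0.8, -0.8, 0.8, 0.8], [-0.8, 0.8, -0.8, -0.8]]
q = state_overlaps(magnetizations)
P = overlap_distribution(weights, q)
print(P, q_ea_from_distribution(P), q_ea_direct(weights, magnetizations))
```

Both routes give the same number, which is smaller than the self-overlap of either state because the cross terms between the two states are negative.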



1.3 Phenomenology of disordered systems 

As I have tried to explain in the previous section, disordered systems share three characteristic
features: first, the presence of quenched disorder; second, the effects of frustration and local degeneracy,
which lead to the existence of many thermodynamic states at low temperature; third, the "freezing"
of the dynamical degrees of freedom in a disordered configuration at low temperature. From the
phenomenological point of view, the two latter characteristics are the most relevant ones.

In this section I shall briefly review the phenomenology of disordered systems that supports this
picture, and which is common to a very wide class of systems, regardless of the specificities of different
models.



1.3.1 Spin glass susceptibilities 

The first clear observation of a "hidden" order in disordered systems came from measurements of the
low-field AC magnetic susceptibility in diluted solutions of iron in gold. The magnetic susceptibility
$\chi$ is directly related to the Edwards-Anderson order parameter $q_{\rm EA}$. It is defined locally as
$\chi_{ii} = \partial m_i / \partial h_i^{\rm ext}$, where $h_i^{\rm ext}$ is the applied external field. Since the contribution of the
external field to the hamiltonian is always a linear term $-h_i^{\rm ext} \sigma_i$, it is easy to see that the following
fluctuation-response relation must hold:

$$ \chi_{ii} = \frac{\partial m_i}{\partial h_i^{\rm ext}} = \frac{1}{\beta} \frac{\partial^2}{\partial (h_i^{\rm ext})^2} \log Z(\{h_i^{\rm ext}\}) = \beta \left( \langle \sigma_i^2 \rangle - \langle \sigma_i \rangle^2 \right) = \beta (1 - m_i^2) . \tag{1.32} $$

Figure 1.1: Magnetic properties of spin glasses. Left: The AC susceptibility of AuFe alloys at different
Fe concentrations for low field (~ 5 G) and $\nu = 155$ Hz (from [7]). Right bottom: The DC susceptibility
of CuMn for two Mn concentrations. Curves (a) and (c) were obtained by cooling in the measurement
field (FC), (b) and (d) are the results of zero-field-cooled (ZFC) experiments (from [5]). Right
top: Remanent magnetization in AuFe (from [^).

The measured local susceptibility is the average of $\chi_{ii}$ over the sites:

$$ \chi_{\rm loc} = \frac{1}{N} \sum_i \chi_{ii} = \beta (1 - q_{\rm EA}) . \tag{1.33} $$

In the absence of magnetic ordering at low temperatures, $\chi_{\rm loc}$ should diverge as $1/T$. The measured
susceptibility shows a sharp cusp instead of a divergence, which indicates that below a certain
temperature $q_{\rm EA} > 0$ (Fig. 1.1).
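The fluctuation-response relation can be verified on the simplest possible case. The sketch below assumes a single Ising spin with energy $-h\sigma$ (so that $m = \tanh(\beta h)$), which is an illustrative simplification and not the interacting system discussed in the text; it compares a finite-difference derivative of the magnetization with $\beta(1 - m^2)$.

```python
import math

def magnetization(beta, h):
    """Thermal average <sigma> of one Ising spin with energy -h * sigma."""
    return math.tanh(beta * h)

def susceptibility_numeric(beta, h, dh=1e-6):
    """Central finite-difference estimate of dm/dh."""
    return (magnetization(beta, h + dh) - magnetization(beta, h - dh)) / (2.0 * dh)

def susceptibility_fluctuation(beta, h):
    """Fluctuation formula beta * (<sigma^2> - <sigma>^2) = beta * (1 - m^2)."""
    m = magnetization(beta, h)
    return beta * (1.0 - m * m)

beta, h = 2.0, 0.3
print(susceptibility_numeric(beta, h), susceptibility_fluctuation(beta, h))
```

At zero field the magnetization vanishes and the susceptibility reduces to $\beta = 1/T$, which is the $1/T$ divergence expected in the absence of ordering.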

A more detailed analysis of the frequency dependence of the measured AC susceptibility suggests
the existence of a glassy magnetic phase, i.e. a phase characterized by the existence of many
metastable states. This is clearly confirmed by measurements of DC magnetic susceptibility and of
remanent magnetization, which both display a very strong dependence of the response on the details
of the preparation of the sample. In DC susceptibility measurements it can be seen that below a critical
temperature, which coincides with the extrapolation to zero frequency of the position of the cusps
in AC measurements, two different values of susceptibility can be measured: if the sample is cooled
in the absence of field one obtains $\chi_{\rm ZFC}$, which is lower than $\chi_{\rm FC}$, the value which is obtained when
the sample is cooled in the presence of field. Moreover, if the external field is strong, a "remanent"
magnetization is observed after it is switched off. The value of the remanent magnetization again
depends on whether the field was applied during the cooling of the sample or only later. In the first
case, the so called Thermo-Remanent Magnetization (TRM) is larger than the Isothermal Remanent
Magnetization (IRM) (Fig. 1.1). This dependence of the sample properties on preparation clearly
demonstrates that many different low temperature thermodynamic states are accessible to the system,
and that they are well separated from each other, in the sense that the free energy barriers between
states are extensive.

Figure 1.2: Left: Viscosity measurements for many glass forming liquids (from [10]). The glass forming
temperature $T_g$ is reported in parenthesis in the legend for each liquid. Right: Structural relaxation
times from dielectric relaxation measurements (from [11]).



1.3.2 Divergence of relaxation times 

The main characteristic of glassy behavior is the divergence of the relaxation time at finite temperature.
For structural glasses, the relaxation time $\tau_\alpha$ is defined as the decay time of density fluctuations, and
it is accessible experimentally both directly and through the Maxwell relation

$$ \eta = G_\infty \tau_\alpha \tag{1.34} $$

where $\eta$ is the viscosity and $G_\infty$ is the infinite-frequency shear modulus of the liquid. Experiments
show that super-cooled liquids have a viscosity which can vary by as much as 15 orders of magnitude
when the temperature varies by a factor of two above the glass forming temperature (Fig. 1.2). Similar
results are obtained from direct measurements.

Spin glass models also show a divergence in relaxation times. A good example is provided by
the p-spin spherical model (for $p \ge 3$). At high temperatures, the Fluctuation-Dissipation Theorem
(FDT) holds, and the correlation $C(t,t')$ is related to the response $F(t,t')$ by the relation

$$ \frac{\partial}{\partial t} C(t,t') = -T \, F(t,t') . \tag{1.35} $$

If the system equilibrates, the correlation function becomes invariant under time translations,
$C(t, t+\tau) = C_{\rm eq}(\tau)$, and it is possible to derive a differential equation for $C_{\rm eq}(\tau)$, whose numerical
solution for $p = 3$ is shown in figure 1.3.

Figure 1.3: Left: The translationally invariant correlation function $C_{\rm eq}(\tau)$ as a function of $\tau$, for
different temperatures $T$. The horizontal line is the value of $q_{\rm EA}$. Right: The out of equilibrium
correlation function $C(t_w, t + t_w)$ as a function of $t$ for different waiting times $t_w$ at temperature
$T = 0.5$. The dotted line is computed in the limit $t_w \to \infty$ and the horizontal line is its limiting value
for $t \to \infty$. Both figures are from [Hj .

What one sees is that as the temperature is decreased, a plateau forms. The length of the plateau
diverges as $T \to T_d$. The analysis of the model shows that $T_d$ is the temperature at which the free
energy becomes dominated by an exponential number of metastable states with energy higher than
the ground state. The value of the plateau coincides with $q_{\rm EA}$.

1.3.3 Ageing 

If the temperature is lowered below $T_d$, a striking break-down of the translational invariance of the
correlation function occurs, signalling that the system becomes unable to equilibrate. In this regime,
the correlation function $C(t_w, t_w + t)$ depends separately on the waiting time $t_w$ and on the duration
of the observation $t$. Only in the limit $t_w \to \infty$ is the validity of the FDT recovered, and the system
finally equilibrates.

This is an example of a very general phenomenon, observed in structural glasses as well as in spin 
glasses, which goes under the name of ageing. Many observables for disordered systems maintain a 
time dependence for very long times under stable external conditions, indicating that they cannot 
equilibrate. This again confirms the existence of many metastable states which "trap" the dynamics 
of the system. 

1.4 The replica method 

In this section I shall briefly review one of the two equivalent analytical methods that can be used to
investigate the equilibrium properties of disordered systems: the replica method [1].

1.4.1 The replica trick 

As I mentioned in the second section of this chapter, many physically relevant quantities are
self-averaging, which is to say that their thermodynamic average is independent of the specific sample.
A most notable example of a self-averaging quantity is the free energy density,

$$ f_J(\beta) = -\lim_{N \to \infty} \frac{1}{\beta N} \log Z_J(\beta) \tag{1.36} $$

where the subscript $J$ denotes the dependence on the disorder. Because $f_J$ is self-averaging,
the free energy density of any sample will be the same, and will be equal to the average of $f_J$ over the
distribution of $J$ (denoted by an overline):

$$ f(\beta) = \overline{f_J(\beta)} = -\lim_{N \to \infty} \frac{1}{\beta N} \overline{\log Z_J(\beta)} = -\lim_{N \to \infty} \frac{1}{\beta N} \int d\mu(J) \, \log \sum_{\sigma} e^{-\beta H_J(\sigma)} . \tag{1.37} $$

Unfortunately, the presence of the logarithm in the integral over the disorder makes it impossible to
calculate it directly. However, one can use the following identity

$$ \log x = \lim_{n \to 0} \frac{x^n - 1}{n} \tag{1.38} $$

and write

$$ \overline{\log Z_J(\beta)} = \lim_{n \to 0} \overline{\left( \frac{Z_J(\beta)^n - 1}{n} \right)} = \lim_{n \to 0} \frac{\overline{Z_J(\beta)^n} - 1}{n} = \lim_{n \to 0} \frac{1}{n} \log \overline{Z_J(\beta)^n} . \tag{1.39} $$

By doing this, instead of $\overline{\log Z_J(\beta)}$ one has to compute $\overline{Z_J(\beta)^n}$, which turns out to be much simpler.
Notice that $Z_J(\beta)^n$ is the partition function of a system in which the dynamical degrees of freedom
are replicated $n$ times and the quenched parameters are the same in each replica (hence the name,
replica trick).
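The content of the replica identity can be illustrated numerically. The sketch below is an assumption-laden toy: the "disorder average" is a plain average over four hypothetical values of the partition function, and the function names are illustrative. It compares the quenched average of $\log Z$ with $\frac{1}{n} \log \overline{Z^n}$ for decreasing $n$.

```python
import math

def mean(values):
    return sum(values) / len(values)

def quenched_average(z_samples):
    """Average of log Z over the samples (the quantity one actually wants)."""
    return mean([math.log(z) for z in z_samples])

def replica_estimate(z_samples, n):
    """(1/n) * log of the average of Z^n, which tends to the quenched average as n -> 0."""
    return math.log(mean([z ** n for z in z_samples])) / n

# Hypothetical values of Z for four realizations of the disorder.
z_samples = [0.5, 1.0, 2.0, 8.0]
exact = quenched_average(z_samples)
for n in (1.0, 0.1, 0.001):
    print(n, replica_estimate(z_samples, n), exact)
```

For $n = 1$ the estimate is the annealed average $\log \overline{Z}$, which by Jensen's inequality is never smaller than the quenched one; the two coincide only in the limit $n \to 0$.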

1.4.2 Solution of the p-spin spherical model 

As an example of the replica method, I am going to sketch its application to the p-spin spherical
model. The hamiltonian is given by

$$ H_J(\sigma) = \sum_{i_1 < \dots < i_p} J_{i_1 \dots i_p} \, \sigma_{i_1} \cdots \sigma_{i_p} . \tag{1.40} $$

The disorder $J$ has a gaussian distribution with average 0, and in order for the hamiltonian to be
extensive its variance must scale as $N^{1-p}$:

$$ P[J_{i_1 \dots i_p} = J] = \mu(J) = \sqrt{\frac{N^{p-1}}{\pi p!}} \, \exp\left\{ -\frac{J^2 N^{p-1}}{p!} \right\} . \tag{1.41} $$

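The starting point is to compute the gaussian integral over the distribution of disorder:

$$ \overline{Z_J(\beta)^n} = \int d\sigma^1 \cdots d\sigma^n \prod_{i_1 < \dots < i_p} \int dJ \, \mu(J) \, \exp\Big\{ -\beta J \sum_{a=1}^{n} \sigma^a_{i_1} \cdots \sigma^a_{i_p} \Big\} \tag{1.42} $$

$$ = \int d\sigma^1 \cdots d\sigma^n \, \exp\Big\{ \frac{\beta^2 p!}{4 N^{p-1}} \sum_{i_1 < \dots < i_p} \sum_{a,b} \sigma^a_{i_1} \sigma^b_{i_1} \cdots \sigma^a_{i_p} \sigma^b_{i_p} \Big\} \tag{1.43} $$

where I have dropped an overall normalization constant which doesn't give an extensive contribution.
Here and in the following, I shall always denote by $i, j, k, \dots$ site indices running from 1 to $N$ and
with $a, b, c, \dots$ replica indices running from 1 to $n$. Notice that after the integral we are left with a
system in which sites are independent and replicas are coupled, which we can rewrite:

$$ \overline{Z_J(\beta)^n} = \int d\sigma^1 \cdots d\sigma^n \, \exp\Big\{ \frac{\beta^2}{4 N^{p-1}} \sum_{a,b} \Big( \sum_i \sigma_i^a \sigma_i^b \Big)^p \Big\} . \tag{1.44} $$

We can now introduce the overlaps between replicas,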

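$$ Q_{ab} = \frac{1}{N} \sum_i \sigma_i^a \sigma_i^b \tag{1.45} $$

and multiply by

$$ 1 = \int dQ_{ab} \int d\lambda_{ab} \, \exp\Big\{ i \lambda_{ab} \Big( N Q_{ab} - \sum_i \sigma_i^a \sigma_i^b \Big) \Big\} \tag{1.46} $$

(up to a normalization constant) to obtain, after absorbing the factor $i$ into $\lambda_{ab}$:

$$ \overline{Z_J(\beta)^n} = \int d\sigma^1 \cdots d\sigma^n \int dQ \int d\lambda \, \exp\Big\{ \frac{\beta^2 N}{4} \sum_{ab} Q_{ab}^p + N \sum_{ab} \lambda_{ab} Q_{ab} - \sum_i \sum_{ab} \sigma_i^a \lambda_{ab} \sigma_i^b \Big\} \tag{1.47} $$

(where $Q = \{Q_{ab}\}$ and $\lambda = \{\lambda_{ab}\}$). This integral is now gaussian in $\sigma$, and can be performed to
obtain:

$$ \overline{Z_J(\beta)^n} = \int dQ \, d\lambda \, e^{-N S(Q, \lambda)} \tag{1.48} $$

where the action is

$$ S(Q, \lambda) = -\frac{\beta^2}{4} \sum_{ab} Q_{ab}^p - \sum_{ab} \lambda_{ab} Q_{ab} + \frac{1}{2} \log \det (2\lambda) . \tag{1.49} $$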
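The origin of the logarithm of the determinant in the action can be made explicit with the gaussian integral over the $n$ replicas of a single site. This is only a sketch of the intermediate step, assuming that $\lambda$ is a symmetric positive-definite $n \times n$ matrix and ignoring the spherical constraint:

```latex
\int \prod_{a=1}^{n} d\sigma_i^a \,
  \exp\Big\{ -\sum_{ab} \sigma_i^a \lambda_{ab} \sigma_i^b \Big\}
  = \frac{\pi^{n/2}}{\sqrt{\det \lambda}}
  = (2\pi)^{n/2} \exp\Big\{ -\frac{1}{2} \log \det (2\lambda) \Big\}
```

Since the $N$ sites are independent, the product over sites gives $\exp\{ -\frac{N}{2} \log \det (2\lambda) \}$ times a constant that depends neither on $Q$ nor on $\lambda$, which is the last term of the action.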

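This integral can be done using the Laplace method, in order to obtain

$$ f = -\lim_{N \to \infty} \frac{1}{\beta N} \lim_{n \to 0} \frac{1}{n} \log \int dQ \, d\lambda \, e^{-N S(Q, \lambda)} = -\lim_{n \to 0} \frac{1}{\beta n} \lim_{N \to \infty} \frac{1}{N} \log e^{-N S(Q, \lambda)} = \lim_{n \to 0} \frac{1}{\beta n} S(Q, \lambda) \tag{1.50} $$

where, in the last two expressions, $Q$ and $\lambda$ extremize the action. Notice however that we had to invert
the order in which the limits over $N$ and $n$ are taken, which is not a priori a legitimate manipulation.
Assuming it to be correct, the saddle point equations one obtains are the following:

$$ \lambda_{ab} = \frac{1}{2} (Q^{-1})_{ab} , \tag{1.51} $$

$$ \frac{\partial S}{\partial Q_{ab}} = 0 = \frac{\beta^2 p}{2} Q_{ab}^{p-1} + (Q^{-1})_{ab} . \tag{1.52} $$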

As we see, the parameter space over which one has to minimize $f$ is the space of symmetric
matrices $Q$. The dimension of these matrices is $n$, which is assumed to go to 0: the only way to
obtain a meaningful result is to write an expression for $f$ which is valid for any finite $n$ and then do
an analytic continuation of this expression for $n \to 0$. However, this requires that the matrix $Q_{ab}$ be
parameterized in such a way that the matrix elements will depend on $n$ and on a fixed number $r$ of
parameters $\{p_1, p_2, \dots, p_r\}$, which will be set to the values that satisfy the saddle point equations and
which will be functions of $n$.

This rather intricate procedure raises three issues. The first is that the whole
procedure is far from rigorous from the mathematical point of view. Second, the parameterization of
$Q$ in the particular form I have described limits the scope for the extremization of $f$: it is not at all
clear a priori that the absolute extremum of $f$ corresponds to a matrix of the "right" form, and we
may end up with an extremum that is not the "true" one. Finally, the stability of the free energy
which is obtained in the end should be carefully checked a posteriori. I shall return to these issues
later.

A "naive" hypothesis would be to assume that since the replicas are just a formal expedient to
compute $f$, the physical quantities should be independent of the replica index, and the overlap matrix
$Q$ should be invariant under permutations of the replica indices. This would lead to the very simple
parameterization $Q_{ab} = q_0 + (1 - q_0) \delta_{ab}$ (the diagonal elements of $Q$ are determined by the spherical
constraint to be 1). However, as already noted, the replicas do have a physical interpretation: the
replicated partition function, which is the proper self-averaging quantity to compute, corresponds to a
composite system consisting of $n$ replicas of the original one. There is no reason why, in the presence
of many states, different replicas should find themselves in the same state. On the contrary,
one should expect the breaking of the replica symmetry to be the signature of the presence of many
states. It turns out that this intuition is correct.

The solution of the p-spin model [12] can be obtained by applying the replica-symmetry breaking
(RSB) scheme introduced by Parisi to solve the SK model [15, 16, 17]. The following parameterization
is assumed for $Q$:

$$ Q_{ab} = \delta_{ab} + q_1 (1 - \delta_{ab}) \, \mathbb{I}(a \div m = b \div m) + q_0 \, \mathbb{I}(a \div m \neq b \div m) \tag{1.53} $$

where the free parameters are $\{m, q_0, q_1\}$, $a \div m$ represents the integer division of $a$ by $m$, and
$\mathbb{I}(\text{event})$ is the indicator function of event (i.e. it is 1 if event is true and 0 otherwise). The parameters
are subject to the conditions $0 < m \le n$, with $m$ such that $n$ is a multiple of $m$, and $0 \le q_0 \le q_1 \le 1$.
This parameterization corresponds to a matrix $Q$ which is made of $n/m$ identical blocks of size $m$
covering the main diagonal, with 1 on the main diagonal and $q_1$ outside of it in each block, and $q_0$
outside the blocks (notice that the case $m = n$ and $q_0 = q_1$ would correspond to the replica symmetric
solution). This parameterization is known as one-step replica-symmetry breaking, or 1RSB for short.
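The block structure of the one-step parameterization is easy to build explicitly for finite $n$. The sketch below is illustrative and makes two assumptions that are not in the text: replica indices are counted from 0 (so that integer division by $m$ groups them into aligned blocks), and the determinant is compared with the closed form that follows from the three distinct eigenvalues of such a matrix, $1 - q_1$, $1 - q_1 + m(q_1 - q_0)$ and $1 - q_1 + m(q_1 - q_0) + n q_0$.

```python
def one_rsb_matrix(n, m, q0, q1):
    """n x n one-step RSB overlap matrix, with 0-based replica indices."""
    if n % m != 0:
        raise ValueError("n must be a multiple of m")
    Q = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if a == b:
                Q[a][b] = 1.0          # diagonal fixed by the spherical constraint
            elif a // m == b // m:
                Q[a][b] = q1           # same block
            else:
                Q[a][b] = q0           # different blocks
    return Q

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in matrix]
    size = len(A)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-14:
            return 0.0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det
        det *= A[col][col]
        for r in range(col + 1, size):
            factor = A[r][col] / A[col][col]
            for c in range(col, size):
                A[r][c] -= factor * A[col][c]
    return det

def one_rsb_determinant(n, m, q0, q1):
    """Closed form: product of the eigenvalues with their multiplicities."""
    e1 = 1.0 - q1                      # multiplicity n - n/m
    e2 = 1.0 - q1 + m * (q1 - q0)      # multiplicity n/m - 1
    e3 = e2 + n * q0                   # multiplicity 1
    return e1 ** (n - n // m) * e2 ** (n // m - 1) * e3

Q = one_rsb_matrix(6, 2, 0.2, 0.5)
print(determinant(Q), one_rsb_determinant(6, 2, 0.2, 0.5))
```

The closed form is what makes the analytic continuation possible: it is an explicit function of $n$, $m$, $q_0$ and $q_1$ that can be evaluated for non-integer $n$.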

If this parameterization is substituted in the expression of the action $S(Q, \lambda)$ (1.49), and the limit
$n \to 0$ is computed, the following expression for the free energy is obtained:

$$ f_{\rm 1RSB} = -\frac{1}{2\beta} \Big\{ \frac{\beta^2}{2} \big[ 1 + (m-1) q_1^p - m q_0^p \big] + \frac{m-1}{m} \log (1 - q_1) + \frac{1}{m} \log \big[ m (q_1 - q_0) + (1 - q_1) \big] + \frac{q_0}{m (q_1 - q_0) + (1 - q_1)} \Big\} . \tag{1.54} $$

This expression can then be minimized to obtain the values of $m$, $q_1$ and $q_0$. What one sees is that for
high temperature, a solution with $m = 1$ exists and is stable. However (for $p \ge 3$) as the temperature
is lowered to a critical value, the solution with $m = 1$ becomes unstable and a new solution with $m < 1$
appears, which is stable and has a lower free energy than the solution with $m = 1$. The value of $m$
undergoes a discontinuity as $T$ crosses this critical temperature, jumping from 1 to a value which is at a
finite distance from 1. As I have already mentioned, the existence of a replica-symmetry breaking solution
is the signature of a glassy phase in which many different thermodynamic states coexist. The p-spin
model undergoes a phase transition at this temperature from a paramagnetic to a glassy phase.
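A one-step free energy of this kind is straightforward to evaluate numerically. The sketch below assumes the form $f = -\frac{1}{2\beta} \{ \frac{\beta^2}{2} [1 + (m-1) q_1^p - m q_0^p] + \frac{m-1}{m} \log(1-q_1) + \frac{1}{m} \log[m(q_1-q_0) + 1 - q_1] + \frac{q_0}{m(q_1-q_0) + 1 - q_1} \}$ with additive constants dropped, which is my reading of the expression and should be checked against the original derivation; the function name is illustrative.

```python
import math

def f_1rsb(beta, p, m, q0, q1):
    """One-step RSB free energy of the spherical p-spin model (assumed form,
    additive constants dropped)."""
    inner = m * (q1 - q0) + (1.0 - q1)
    bracket = (
        0.5 * beta ** 2 * (1.0 + (m - 1.0) * q1 ** p - m * q0 ** p)
        + (m - 1.0) / m * math.log(1.0 - q1)
        + math.log(inner) / m
        + q0 / inner
    )
    return -bracket / (2.0 * beta)

# Two consistency checks of the assumed form:
# for q0 = q1 the result no longer depends on m (replica symmetric case) ...
print(f_1rsb(1.5, 3, 0.3, 0.4, 0.4), f_1rsb(1.5, 3, 0.8, 0.4, 0.4))
# ... and for q0 = q1 = 0 it reduces to -beta / 4.
print(f_1rsb(2.0, 3, 0.5, 0.0, 0.0))
```

Locating the actual saddle point would require solving the stationarity conditions in $m$, $q_0$ and $q_1$, which is not attempted here.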

I would like to conclude this section with three remarks. The first concerns the issues I mentioned 
regarding the validity of the replica method. As I wrote, in general the procedure is not mathematically 
rigorous. However, one should note that in the case of the SK model the Parisi solution has been 
recently proved to be exact. Moreover the method has been applied to a large number of fairly different 
models, and in each case the results obtained are sensible: it appears safe to conjecture its validity, 
with the proviso that the stability of the solution it gives should be checked a posteriori and that one 
cannot rule out the existence of other solutions, possibly with lower free energy. 

Second, the example of the p-spin is particularly simple. In other models, including SK, one needs
to consider a more complicated parameterization of the overlap matrix, which consists in applying
the procedure I described recursively: one starts with a "block" of size $m_i$ which has 1 on the main
diagonal and $q_i$ outside of it, and introduces blocks of size $m_{i+1} = m_i \div p$ (for some integer $p$) on
the diagonal, with the same structure as the starting block, but a new value for the off-diagonal
elements. This procedure can be repeated for any number of steps. The solution of the p-spin is
one-step replica-symmetry breaking, denoted 1RSB. In the case of the SK model one needs an infinite
number of steps, and the solution is said to be full replica-symmetry breaking (FRSB).

Finally, the parameters over which one needs to extremize the free energy are the matrix elements
of $Q$ (through the Parisi parameterization), which are scalar quantities. This is a general feature
of fully-connected models. However, as we shall see in the following section, the parameters over
which one extremizes become much more complicated in the case of diluted models.



1.4.3 Replica formalism for diluted models 

In order to apply the replica method to diluted systems, one needs to generalize the approach that I
have outlined for the case of the p-spin [THl UHl HH [22]. The starting point is the same: the average
over disorder of the n-replicated partition function. For a system of Ising spins $\sigma_i \in \{-1, 1\}$, with
$\sigma = \{\sigma_1, \dots, \sigma_N\}$ and with hamiltonian $H_J(\sigma)$, we have:

$$ \overline{Z_J(\beta)^n} = \int d\mu(J) \sum_{\sigma^1} \cdots \sum_{\sigma^n} \exp\Big\{ -\beta \sum_{a=1}^{n} H_J(\sigma^a) \Big\} = \sum_{\sigma^1} \cdots \sum_{\sigma^n} \overline{\exp\Big\{ -\beta \sum_{a=1}^{n} H_J(\sigma^a) \Big\}} \tag{1.55} $$

where $\sigma^a$ is the $N$-spin configuration of the $a$-th replica. In fact, the n-replicated spin configuration is
a matrix $\boldsymbol{\sigma}$ with $N$ rows corresponding to the sites and $n$ columns corresponding to the replicas. The
$i$-th row is the n-component vector $\vec{\sigma}_i$, in which the component $\sigma_i^a$ is the value of the spin on the site $i$
for replica $a$, and the $a$-th column is the $N$-component configuration of replica $a$.

As an example of hamiltonian, we can consider the diluted version of the Ising p-spin model, which
we shall discuss in more detail in the following:

$$ H_J(\sigma) = \sum_{m=1}^{M} \frac{1}{2} \Big( 1 - J_m \, \sigma_{i_1^m} \cdots \sigma_{i_p^m} \Big) \tag{1.56} $$

where the sum is over $M$ terms, each consisting of the product of $p$ spins, with indices $i_j^m$ with
$j = 1, \dots, p$ selected uniformly at random between 1 and $N$, and where the couplings $J_m$ are $\pm 1$
uniformly at random. The additive constant present in each term of the sum is such that the energy
is positive or null. The factor 1/2 is such that the value of the energy is equal to the number of terms
in the sum which have a $J_m$ with a different sign relative to the product of the spins. On a random
configuration, half the terms will be equal to 1 and the other half to 0, so that the energy will be
extensive if $M = O(N)$.
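A diluted hamiltonian of this type, in which each term contributes $\frac{1}{2}(1 - J_m \prod_j \sigma_{i_j^m})$, is simple to sample and evaluate. The sketch below is illustrative: sites are indexed from 0, the $p$ sites of each term are drawn without repetition (an assumption, the text only says "uniformly at random"), and the function names are not from the text.

```python
import random

def random_instance(n_sites, n_terms, p, rng):
    """List of (sites, J): p distinct 0-based site indices and a coupling J = +1 or -1."""
    return [(rng.sample(range(n_sites), p), rng.choice((-1, 1)))
            for _ in range(n_terms)]

def energy(instance, sigma):
    """Number of terms whose coupling J has the opposite sign to the product of its spins."""
    total = 0
    for sites, coupling in instance:
        prod = 1
        for i in sites:
            prod *= sigma[i]
        total += (1 - coupling * prod) // 2   # each term is exactly 0 or 1
    return total

rng = random.Random(0)
instance = random_instance(n_sites=20, n_terms=50, p=3, rng=rng)
sigma = [rng.choice((-1, 1)) for _ in range(20)]
print(energy(instance, sigma))   # on a random configuration, about half of the 50 terms
```

Because every term is 0 or 1, the energy is an integer between 0 and $M$, and a configuration of zero energy satisfies all the terms at once.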

We can interpret the right hand side of (1.55) as the partition function of an effective hamiltonian
depending on the full replicated configuration $\boldsymbol{\sigma}$:

$$ \overline{Z_J(\beta)^n} = \sum_{\boldsymbol{\sigma}} \exp\{ -\beta \mathcal{H}(\boldsymbol{\sigma}) \} \tag{1.57} $$

Since the distribution of disorder is independent of the site, the averaged quantity in the right hand
side of (1.55) must be invariant under permutations of site indices. This implies that the effective
hamiltonian in (1.57) can depend on $\boldsymbol{\sigma}$ only through

$$ c(\vec{\tau}) = \frac{1}{N} \sum_{i=1}^{N} \delta_{\vec{\tau}, \vec{\sigma}_i} \tag{1.58} $$

which is the fraction of sites that have replicated configuration $\vec{\tau}$. Even though $c(\vec{\tau})$ actually depends
on the replicated configuration $\boldsymbol{\sigma}$, we are going to assume it to be fixed and avoid its appearance in
the notation. Also, notice that $\sum_{\vec{\tau}} c(\vec{\tau}) = 1$.

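The overlap between replica configurations $Q_{ab}$ can also be expressed in terms of $c(\vec{\tau})$:

$$ Q_{ab} = \frac{1}{N} \sum_{i=1}^{N} \sigma_i^a \sigma_i^b = \sum_{\vec{\tau}} c(\vec{\tau}) \, \tau^a \tau^b . \tag{1.59} $$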

This was to be expected: in the calculation for the p-spin, the free energy we obtained depended only
on $Q$, and (1.59) implies that what we obtained was actually dependent on $c(\vec{\tau})$ only. This is a general
feature of fully connected models: their free energies (or rather, the actions whose extrema are equal
to the free energy) depend only on the overlaps between replicas. However, for diluted models one
needs to generalize (1.59) to include higher moments:

$$ Q_{a_1 \cdots a_k} = \sum_{\vec{\tau}} c(\vec{\tau}) \, \tau^{a_1} \cdots \tau^{a_k} . \tag{1.60} $$

The crucial point is that even though these quantities are more complicated than the overlaps, they
are still conceptually equivalent to $c(\vec{\tau})$, which provides the full description of the structure of the
states of the system, be it fully connected or diluted.
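The equivalence between the overlaps and the fraction of sites with a given replicated configuration can be checked directly on a small example. The sketch below is illustrative: the replicated configuration is a hypothetical list of $N = 4$ rows with $n = 3$ replicas each, and the function names are not from the text.

```python
from collections import Counter

def replica_fractions(config):
    """c(tau): fraction of sites whose row (sigma_i^1, ..., sigma_i^n) equals tau."""
    n_sites = len(config)
    counts = Counter(tuple(row) for row in config)
    return {tau: k / n_sites for tau, k in counts.items()}

def overlap_direct(config, a, b):
    """Q_ab = (1/N) sum_i sigma_i^a * sigma_i^b, summing over sites."""
    return sum(row[a] * row[b] for row in config) / len(config)

def overlap_from_c(c, a, b):
    """Q_ab = sum_tau c(tau) * tau^a * tau^b, summing over replicated configurations."""
    return sum(weight * tau[a] * tau[b] for tau, weight in c.items())

# Hypothetical replicated configuration: one row per site, one column per replica.
config = [(1, 1, -1), (1, -1, -1), (1, 1, -1), (-1, 1, 1)]
c = replica_fractions(config)
for a in range(3):
    for b in range(3):
        print(a, b, overlap_direct(config, a, b), overlap_from_c(c, a, b))
```

The same dictionary `c` also gives the higher moments: it is enough to multiply more components of `tau` inside the sum.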

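To see in more detail how it is possible to write the free energy in terms of $c(\vec{\tau})$, we can go back
to (1.57), where we recall that $\mathcal{H}(\boldsymbol{\sigma}) = \mathcal{H}[c(\vec{\tau})]$:

$$ \overline{Z_J(\beta)^n} = \sum_{\{c(\vec{\tau})\}} \frac{N!}{\prod_{\vec{\tau}} [N c(\vec{\tau})]!} \; e^{-\beta \mathcal{H}[c(\vec{\tau})]} \; \mathbb{I}\Big( \sum_{\vec{\tau}} c(\vec{\tau}) = 1 \Big) \tag{1.61} $$

where the sum is over $2^n$ variables, each variable being the value of $c$ for one of the possible $2^n$
n-component spin configurations, such that $N c(\vec{\tau})$ takes integer values between 0 and $N$; where the
multinomial factor is just the number of replicated configurations $\boldsymbol{\sigma}$ that give rise to the same
distribution $c(\vec{\tau})$; and where the last indicator function ensures the normalization of $c(\vec{\tau})$.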
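The counting role of the multinomial factor can be verified by brute force for very small sizes. The sketch below is illustrative (the function names are not from the text): it enumerates all $(2^n)^N$ replicated configurations, groups them by the occupation numbers $N c(\vec{\tau})$, and compares the size of each group with $N! / \prod_{\vec{\tau}} [N c(\vec{\tau})]!$.

```python
from collections import Counter
from itertools import product
from math import factorial

def multinomial(counts):
    """N! / prod_k (N_k!) for occupation numbers N_k that sum to N."""
    result = factorial(sum(counts))
    for k in counts:
        result //= factorial(k)
    return result

def count_by_occupation(n_sites, n_replicas):
    """Number of replicated configurations for each set of occupation numbers."""
    rows = list(product((-1, 1), repeat=n_replicas))   # the 2^n possible rows
    tally = Counter()
    for config in product(rows, repeat=n_sites):        # all (2^n)^N configurations
        occupation = Counter(config)
        key = tuple(occupation[row] for row in rows)
        tally[key] += 1
    return tally

tally = count_by_occupation(n_sites=3, n_replicas=2)
print(all(multinomial(key) == value for key, value in tally.items()))
```

The enumeration grows as $(2^n)^N$, so this check is only feasible for a handful of sites; its purpose is to make the meaning of the multinomial factor concrete.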

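In the limit $N \to \infty$ the sum becomes an integral and the multinomial coefficient can be
approximated with Stirling's formula to obtain

$$ f(\beta) = -\lim_{N \to \infty} \frac{1}{\beta N} \lim_{n \to 0} \frac{1}{n} \log \overline{Z_J(\beta)^n} \tag{1.62} $$

$$ = -\lim_{n \to 0} \frac{1}{n} \lim_{N \to \infty} \frac{1}{\beta N} \log \int_0^1 \prod_{\vec{\tau}} dc(\vec{\tau}) \, \exp\Big\{ N \Big[ -\sum_{\vec{\tau}} c(\vec{\tau}) \log c(\vec{\tau}) - \beta \mathcal{H}[c(\vec{\tau})] \Big] \Big\} \; \mathbb{I}\Big( \sum_{\vec{\tau}} c(\vec{\tau}) = 1 \Big) \tag{1.63} $$

$$ = \lim_{n \to 0} \frac{1}{\beta n} \; \underset{c(\vec{\tau})}{\rm extremum} \Big\{ \sum_{\vec{\tau}} c(\vec{\tau}) \log c(\vec{\tau}) + \beta \mathcal{H}[c(\vec{\tau})] \Big\} \tag{1.64} $$

where (as before) we have exchanged the order of the limits $N \to \infty$ and $n \to 0$. Since the exponent
in (1.63) is extensive, $\mathcal{H}[c(\vec{\tau})]$ in (1.63) and (1.64) has to be understood as the effective hamiltonian
per site.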

With this formalism, the problem of computing the free energy of a (possibly diluted) disordered 
Ising model is decomposed into three tasks: 

1. Find the effective hamiltonian $\mathcal{H}[c(\vec{\tau})]$

2. Compute, for each value of $n$, the extremum of the free energy functional in $c(\vec{\tau})$ appearing on
the right hand side of (1.64)

3. Perform the analytic continuation of the result to $n = 0$

In Chapter 5 I shall use this formalism to derive some properties of the solutions of an optimization
problem which is formally equivalent to a diluted Ising spin glass.

Chapter 2

Optimization problems and algorithms

In the previous Chapter, I have given a very brief overview of the physics of disordered systems. In
this Chapter, I shall introduce a different kind of disordered system, which arises from the study of
combinatorial optimization problems, and I shall discuss some aspects specific to such systems, and
what they have in common with the disordered systems studied in physics.

In the first Section, I shall give some examples of combinatorial optimization problems; in Section
2.2 I shall introduce the two specific problems that have been the subject of my research, k-SAT and
k-XORSAT; then I shall introduce some notions from complexity theory, in Section 2.3; finally, in
Section 2.4 I shall present some families of algorithms that are useful for finding solutions to
optimization problems, and whose properties also shed some light on the underlying structure of the
problems themselves.

Most of the material discussed in this Chapter can be found in [2] . 

2.1 Some examples of combinatorial optimization problems 

Optimization problems are concerned with finding the "best" (or optimal) allocation of finite resources
to achieve some purpose. They clearly form a very general and important class of problems. An early
example of an optimization problem is narrated in Virgil's Aeneid: Dido, a Phoenician princess, is
obliged to flee Tyre, her hometown, after her husband is murdered by her brother, a cruel tyrant. She
embarks with a small group of refugees, and lands in Libya, where she asks the king Iarbas to let her
purchase some land to found a new city, Carthage. Iarbas, in love with Dido but rejected by her, has no
intention of allowing the settlement, and offers only as much land as can be enclosed in a bull's hide.
He is, however, outwitted by Dido, who cuts the hide in thin strips, which she joins to form a long
string. With that, she encloses an area shaped as a semi-circle, delimited by the sea, and sufficient to
build Carthage. In this legendary tale, Dido not only had the brilliant idea of cutting the hide, but also
solved a non-trivial optimization problem: what is the curve of given perimeter that encloses the
largest area?

Combinatorial optimization problems are, in a way, simpler: the set of possible solutions is discrete. 
This restriction might appear severe in view of practical applications, but in fact it is not: many 
resources, such as industrial machines, skilled workers or computer chips are indeed indivisible. Let 
us begin with an example, which I shall use to illustrate a general, formal definition, and after which 
I shall give some more examples of different families of combinatorial optimization problems.

Consider the following

Knapsack Problem (KP) Given a set $S$ of items $i = 1, \dots, N$, each having a value $v_i \in \mathbb{R}_+$ and a
weight $w_i \in \mathbb{R}_+$, what is the subset $S' \subset S$ with the largest total value $V = \sum_{i \in S'} v_i$ such
that the total weight $W = \sum_{i \in S'} w_i$ satisfies $W \le W^*$ for some given $W^*$?

The possible solutions (or configurations) are all the subsets that can be formed with elements from
$S$, which are a discrete set of cardinality $2^N$ (corresponding to the two choices "present" or
"not-present" for each item in $S$). A specific instance of the general problem is defined by the pairs
$\{(v_i, w_i), i = 1, \dots, N\}$, and by the maximum allowed weight $W^*$.

In general, an instance of the problems I shall consider will be defined by specifying the following
three characteristics:

1. A set $\mathcal{C}$ of possible configurations;

2. A cost function $F : \mathcal{C} \to \mathbb{R}$ that associates a cost $F(C)$ to every configuration $C \in \mathcal{C}$, and which
can be computed in polynomial time;

3. An objective, that is to say a condition on $F(C)$ which must be satisfied.

In the knapsack example, $\mathcal{C}$ is the set of all subsets of $S$, the cost function $F$ is

$$ F(C) = \sum_{i \in C} v_i \tag{2.1} $$

and the objective is of the form $F(C) \ge F^*$.

In general, for a given instance, one can ask the following questions:

Decision Does a configuration that realizes the objective exist?

Optimization What is the "tightest" objective which can be realized? For example, the largest value
of $F^*$.

Search Which configuration realizes the objective?

Enumeration How many configurations realize the objective?

Approximation Which configuration realizes a weaker form of the objective, for example $F(C) \ge
\gamma F^*$ for some constant $\gamma < 1$?

The knapsack example above is a combination of an optimization problem (finding the largest possible
value which can be realized) and a search one (finding the corresponding configuration). Of course,
one could ask many more questions. These are just the ones I shall be interested in in the following.
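The different questions can be made concrete on a tiny knapsack instance by exhaustive enumeration of the $2^N$ subsets. The sketch below is illustrative: the instance is hypothetical, the weight limit is treated as a condition that admissible configurations must satisfy (an assumption about how to combine it with the objective $F \ge F^*$), and the function name is not from the text.

```python
from itertools import combinations

def knapsack_questions(values, weights, max_weight, target):
    """Answer decision, optimization, search and enumeration by brute force.

    A subset is admissible if its total weight does not exceed max_weight;
    it realizes the objective if its total value is at least target."""
    n_items = len(values)
    best_value, best_subset, count = 0, (), 0
    for size in range(n_items + 1):
        for subset in combinations(range(n_items), size):
            if sum(weights[i] for i in subset) > max_weight:
                continue
            value = sum(values[i] for i in subset)
            if value >= target:
                count += 1                     # enumeration
            if value > best_value:
                best_value, best_subset = value, subset
    return {
        "decision": count > 0,                 # does a solution exist?
        "optimization": best_value,            # largest realizable F*
        "search": best_subset,                 # a configuration realizing it
        "enumeration": count,                  # how many realize F >= target
    }

# Hypothetical instance with three items (0-based indices).
print(knapsack_questions(values=[6, 10, 12], weights=[1, 2, 3], max_weight=5, target=16))
```

Exhaustive enumeration takes a time that grows as $2^N$, which is exactly why the existence of more efficient algorithms is the central question for this kind of problem.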
Let me cite a few more examples of problems: 

Number Partitioning Given a set of positive integers $S = \{n_i \in \mathbb{N}, i = 1, \dots, N\}$, find a subset
$S' \subset S$ such that $\sum_{i \in S'} n_i = \sum_{i \notin S'} n_i$.

Subset Sum Given a positive integer $K$ and a set of $N$ positive integers $S = \{n_i \in \mathbb{N}, i = 1, \dots, N\}$,
find a subset $S' \subset S$ such that $\sum_{i \in S'} n_i = K$.

Integer Linear Programming (ILP) Given an n-component real vector $c$, an $m \times n$ real matrix
$A$, and an m-component real vector $b$, find an n-component vector $x$ with non-negative integer
components which maximizes $c \cdot x$ subject to the constraints $A x \le b$.

Is Prime Given a positive integer $N$, determine if $N$ is prime.

Many combinatorial optimization problems are defined on graphs. A graph $\mathcal{G}$ is a pair of sets: a set
of points, called vertices, $v \in \mathcal{V}$, and a set of distinct segments connecting pairs of points in $\mathcal{V}$, called
edges, $e \in \mathcal{E}$: $\mathcal{G} = (\mathcal{V}, \mathcal{E})$. Three special kinds of graphs are cycles, i.e. loops; trees, which are connected
graphs that contain no cycles; and bipartite graphs, in which the set of vertices is divided in two,
$\mathcal{V} = \mathcal{V}_1 \cup \mathcal{V}_2$, and all edges have one endpoint in $\mathcal{V}_1$ and the other in $\mathcal{V}_2$. Let me just mention a few
important problems defined on graphs:

Hamiltonian Cycle (HC) Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, find a cycle $\mathcal{G}' \subset \mathcal{G}$ containing all the vertices.

Traveling Salesman Problem (TSP) Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and a weight $w(e) \in \mathbb{R}_+$
associated to each edge, find a HC with minimum total weight.

Minimum Spanning Tree (MST) Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and a weight $w(e) \in \mathbb{R}_+$ associated
to each edge, find a tree $\mathcal{G}' \subset \mathcal{G}$ containing all the vertices of $\mathcal{G}$ with minimum total weight.

Vertex covering (VC) Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, find a subset $\mathcal{V}' \subset \mathcal{V}$ of the vertices of $\mathcal{G}$ such
that each edge $e \in \mathcal{E}$ has at least one of its endpoints in $\mathcal{V}'$, and minimizing $|\mathcal{V}'|$.

q-Coloring (q-COL) Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, assign to each vertex a color $c \in \{1, 2, 3, \dots, q\}$
such that no edge in $\mathcal{E}$ has two endpoints of the same color.

Matching Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and a weight $w(e) \in \mathbb{R}_+$ associated to each edge, find a subgraph
$\mathcal{G}' = (\mathcal{V}', \mathcal{E}') \subset \mathcal{G}$ such that each vertex in $\mathcal{V}'$ has one and only one edge in $\mathcal{E}'$, and which maximizes
the total weight. Often $\mathcal{G}$ is bipartite, in which case the problem is called bipartite matching.

Max Clique Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, find its largest clique, i.e. fully connected subgraph.

Min (or Max) Cut Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and a weight $w(e) \in \mathbb{R}_+$ associated to each edge, find
a partition $(\mathcal{V}_1, \mathcal{V}_2)$ of $\mathcal{V}$ such that the total weight of the edges that have one endpoint in $\mathcal{V}_1$ and the
other in $\mathcal{V}_2$ is minimized (or maximized).
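Two of the graph problems above, vertex covering and q-coloring, can be illustrated with a few lines of code. The sketch below is illustrative: graphs are represented as a list of vertices and a list of edges (pairs of vertices), the minimum vertex cover is found by exhaustive search over subsets of increasing size, and the function names are not from the text.

```python
from itertools import combinations

def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in the set cover."""
    return all(u in cover or v in cover for u, v in edges)

def minimum_vertex_cover(vertices, edges):
    """Smallest vertex cover, found by trying subsets of increasing size."""
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_vertex_cover(edges, set(subset)):
                return set(subset)

def is_proper_coloring(edges, color):
    """True if no edge has two endpoints of the same color."""
    return all(color[u] != color[v] for u, v in edges)

# Hypothetical example: a cycle of four vertices.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(minimum_vertex_cover(vertices, edges))
print(is_proper_coloring(edges, {0: 1, 1: 2, 2: 1, 3: 2}))
```

Checking that a given subset is a cover, or that a given assignment is a proper coloring, takes a time linear in the number of edges, while the exhaustive search over subsets is exponential in the number of vertices.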

All these problems are interesting from the theoretical point of view, and relevant for their practical
applications. A further family of problems concerns boolean satisfiability, which I shall introduce in
the next Section. The importance of boolean satisfiability problems and their connection to the other
problems will be discussed in Section 2.3.

2.2 Boolean satisfiability: k-SAT and k-XORSAT

Boolean satisfiability problems are concerned with the following general question: given a boolean
function $\mathcal{F}(x)$ over $N$ boolean variables $x = (x_1, \dots, x_N) \in \{\text{true}, \text{false}\}^N$, is there an assignment
of the variables which makes the function evaluate to true? The different problems of the family
correspond to specific choices of the form of the function $\mathcal{F}$.

2.2.1 Introduction to k-SAT

The prototype of satisfiability problems is the following. Given an N-tuple of boolean variables $x =
(x_1, \dots, x_N)$, a literal is defined as a variable $x_i$ or its negation $\bar{x}_i$; a k-clause (or simply
clause, of length $k$) is defined as the disjunction of $k$ literals, e.g. for $k = 3$: $x_2 \vee \bar{x}_4 \vee x_5$; finally, a
formula is defined as the conjunction of $M$ clauses. For example, for $N = 7$, $M = 3$:

jc-(x) = (ig V V xe) A (x2 V X3) A {xi V X3 V X5 V X7) . (2.2) 

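Such a formula is said to be in conjunctive normal form (CNF), which is defined as

$$ \mathcal{F}(x) = \bigwedge_{m=1}^{M} \Big( \bigvee_{i \in I_m} x_i \; \vee \bigvee_{j \in \bar{I}_m} \bar{x}_j \Big) $$

where $I_m$ and $\bar{I}_m$ are subsets of $\{1, \dots, N\}$ such that $I_m \cap \bar{I}_m = \emptyset$ for each $m = 1, \dots, M$.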

The satisfiability problem (SAT) is the problem of determining whether or not a given CNF formula
admits at least one satisfying assignment (also called a solution). An interesting special case is that
in which all the clauses have the same length $k$, in which case the problem is known as k-SAT. If
the answer is "yes", the formula is said to be satisfiable, which I shall denote by SAT¹; otherwise it is
unsatisfiable, which I shall denote by UNSAT.

The same questions apply to k-SAT as to any other combinatorial optimization problem, namely
the decision, optimization, solution, enumeration, and approximation problems, where the quantity 
to be minimized is the number of violated clauses. 
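
The definitions above can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that literals are encoded as signed integers (`+i` for $x_i$, `-i` for its negation) and that a formula is a list of clauses; the function names are hypothetical. The exhaustive loop over all $2^N$ assignments answers the decision, enumeration and optimization questions at once, but it is usable only for very small $N$.

```python
from itertools import product

def literal_is_true(literal, assignment):
    """assignment maps the 1-based variable index i to True or False."""
    value = assignment[abs(literal)]
    return value if literal > 0 else not value

def count_violated(formula, assignment):
    """Number of clauses in which every literal is false."""
    return sum(
        1
        for clause in formula
        if not any(literal_is_true(lit, assignment) for lit in clause)
    )

def brute_force(formula, n_vars):
    """Return (is_sat, number_of_solutions, min_violated) by exhaustive enumeration."""
    n_solutions = 0
    min_violated = len(formula)
    for values in product([False, True], repeat=n_vars):
        assignment = dict(zip(range(1, n_vars + 1), values))
        violated = count_violated(formula, assignment)
        min_violated = min(min_violated, violated)
        if violated == 0:
            n_solutions += 1
    return n_solutions > 0, n_solutions, min_violated

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
example = [[1, -2], [2, 3], [-1, -3]]
print(brute_force(example, 3))
```

For this small example the formula is SAT and has two solutions, so the minimum number of violated clauses is zero.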

A lot of attention has been devoted to k-SAT, principally for three reasons. First, for its theoretical
relevance: many problems, from theorem proving procedures in propositional logic (the original
motivation for k-SAT), to learning models in artificial intelligence, to inference and data analysis, can all
be expressed as CNF formulae. Second, because it is directly involved in a large number of practical
problems, from VLSI circuit design to cryptography, from scheduling to communication protocols, all
of which actually require solving or optimizing real instances of k-SAT formulae. Third, and probably
most notably, because of its central role in complexity theory, which I shall discuss in the next Section.

The questions of interest in the study of k-SAT can be divided into two broad families: on one hand
those regarding the general properties of CNF formulae and of their solutions (when they exist); on
the other hand, those concerning the algorithms capable of answering the different questions one may
ask (decision, optimization, ...); and of course, the intersection of the two (for example, proving that
a certain algorithm succeeds in finding a solution under some assumptions also proves that a formula
verifying those same assumptions must be SAT).

The answers that one can seek can also be divided into two qualitative types: on
one hand, results that are true in general, for any instance of k-SAT (under certain conditions),
and on the other hand, results that are true in a probabilistic sense. Let me clarify this last case with
an example. Suppose one considers the ensemble of all possible k-SAT formulae with given $N$ and $M$,
with uniform weight. The total number $\mathcal{N}_C$ of $k$-clauses that one can form with $N$ variables is given
by the number of choices of $k$ among $N$ indices times the number of choices for the $k$ negations, i.e.

$$ \mathcal{N}_C = \binom{N}{k} 2^k . \tag{2.4} $$

The number of formulae $\mathcal{N}_F$ that can be made with $M$ independently chosen clauses is then

$$ \mathcal{N}_F = \left( \mathcal{N}_C \right)^M . \tag{2.5} $$

Consider now a clause $C$ in the formula, for simplicity $C = x_1 \lor \cdots \lor x_k$. This clause will be satisfied by
any of the $2^k$ possible values of $(x_1, \ldots, x_k)$ except the one corresponding to $x_i = \text{FALSE}$ for
$i = 1, \ldots, k$: out of all the possible assignments, only a fraction $1 - 1/2^k$ will satisfy any given clause.
Since the formula contains $M = \alpha N$ clauses (where $\alpha$ is defined as the ratio $M/N$), the average number of
satisfying assignments will be

$$ \overline{\mathcal{N}_S} = 2^N \times \left( 1 - \frac{1}{2^k} \right)^M . \tag{2.6} $$

¹ The use of SAT to designate both the general satisfiability problem and the satisfiable property of a formula should
not lead to confusion, since in the future I shall be concerned exclusively with k-SAT.



If we consider large formulae, i.e. the limit $N \to \infty$, we see that the average number of solutions tends
to 0 if

$$ \alpha > - \frac{\log 2}{\log \left( 1 - 2^{-k} \right)} . \tag{2.7} $$

Notice that the average number of solutions is larger than or equal to the probability that a formula
is SAT, since

$$ \overline{\mathcal{N}_S} = \sum_{n=0}^{2^N} n \times P[\text{The number of solutions is } n] \ge \sum_{n=1}^{2^N} P[\text{The number of solutions is } n] \tag{2.8} $$

and the sum on the right hand side is the probability that a formula is SAT. Therefore, we see that in
the limit $N \to \infty$ a random k-SAT formula chosen with uniform weight among all those with $M = \alpha N$
clauses is UNSAT with probability 1 if $\alpha > -\log 2 / \log(1 - 2^{-k})$.
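
The bound can be checked numerically. Writing the average number of solutions as $\exp\{ N [ \log 2 + \alpha \log(1 - 2^{-k}) ] \}$ shows that it vanishes exponentially in $N$ as soon as the bracket is negative, and since $\log(1 - 2^{-k}) < 0$ this happens exactly when $\alpha$ exceeds the value in (2.7). The following Python sketch (function names are hypothetical, chosen for illustration) evaluates the bracket and the resulting bound.

```python
import math

def log_average_per_variable(k, alpha):
    """(1/N) log of the average number of solutions: log 2 + alpha log(1 - 2^-k)."""
    return math.log(2) + alpha * math.log(1 - 2.0 ** (-k))

def first_moment_bound(k):
    """Value of alpha above which the average number of solutions vanishes."""
    return -math.log(2) / math.log(1 - 2.0 ** (-k))

for k in (2, 3, 4):
    print(k, round(first_moment_bound(k), 3))
```

For $k = 3$ the bound evaluates to approximately 5.19, so random 3-SAT formulae with more than about 5.19 clauses per variable are UNSAT with probability 1 in the large $N$ limit.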

This kind of statement is very useful to characterize the typical properties of k-SAT formulae under
some given conditions. In many cases, the typical behavior is the interesting one, as it dominates the
observable phenomena. The problem of studying k-SAT formulae extracted from some distribution is
often called Random-k-SAT. If the distribution is not specified, the uniform one is assumed.

Many interesting properties are easily proved for Random-k-SAT. For example, for $\alpha \to 0$ the
probability $P_{\text{SAT}}(\alpha)$ that a random formula is SAT tends to 1. Moreover, it must be a decreasing function
of $\alpha$, since the property of being SAT is monotone: in order for a formula to be SAT, any sub-formula
(made with a subset of its clauses) has to be satisfiable as well. In other words, adding clauses to a
formula can only decrease its chances of being SAT, and adding random clauses to a random formula
can only decrease its probability of being SAT.

From the physicist's point of view, probabilistic results are most interesting, because a random
distribution of formulae can be treated as a disordered system with some distribution of disorder.
Indeed, one can represent Random-k-SAT as a spin glass. Each variable $x_i$ will correspond to an Ising
spin $\sigma_i$, which will be 1 if $x_i = \text{TRUE}$ and $-1$ otherwise. For a given configuration, the number of
violated clauses will play the role of the energy:

$$ E(\sigma) = \sum_{m=1}^{M} \prod_{j=1}^{k} \frac{1 + J^m_j \sigma_{i^m_j}}{2} \tag{2.9} $$

where $i^m_j$ is the index of the $j$-th variable appearing in the $m$-th clause, and $J^m_j$ is 1 if the variable
appears negated and $-1$ otherwise. The sets $\{i^m_j\}$ and $\{J^m_j\}$ define some random couplings which
involve terms with $1, 2, \ldots, k$ spins, have unit strength, and are attractive or repulsive with equal
probability. As usual with statistical mechanics systems, we shall be interested in the thermodynamic
limit $N \to \infty$. Since a random configuration violates a random clause with probability $2^{-k}$, the energy
is extensive (i.e. proportional to $N$) if $\alpha$ is of order $O(1)$ as $N \to \infty$. This is a perfectly legitimate
diluted spin glass model. In fact, in Chapters 3 and 5 I shall present some results on Random-k-SAT
obtained by applying the replica method of Paragraph 1.4.3.
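
The energy function (2.9) can be evaluated directly. The Python sketch below is an illustration under stated assumptions: clauses are encoded as lists of signed integers (`+i` for $x_i$, `-i` for its negation), which is a convention of this example rather than of the text, and the helper names are hypothetical. Each factor $(1 + J \sigma)/2$ equals 1 when the corresponding literal is false and 0 when it is true, so the product over a clause is 1 only if the clause is violated.

```python
def to_spin_glass(formula):
    """Translate signed-integer clauses into (indices, couplings).

    indices[m][j] is the 0-based variable index i^m_j; couplings[m][j] is J^m_j,
    equal to +1 if the variable appears negated and -1 otherwise.
    """
    indices = [[abs(lit) - 1 for lit in clause] for clause in formula]
    couplings = [[1 if lit < 0 else -1 for lit in clause] for clause in formula]
    return indices, couplings

def energy(indices, couplings, spins):
    """Number of violated clauses for spins[i] in {+1, -1} (+1 meaning TRUE)."""
    total = 0
    for clause_indices, clause_couplings in zip(indices, couplings):
        violated = 1
        for i, coupling in zip(clause_indices, clause_couplings):
            # (1 + J * sigma) / 2 is 1 if the literal is false, 0 if it is true
            violated *= (1 + coupling * spins[i]) // 2
        total += violated
    return total

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
idx, J = to_spin_glass([[1, -2], [2, 3], [-1, -3]])
print(energy(idx, J, [1, 1, 1]))    # all variables TRUE
print(energy(idx, J, [-1, -1, 1]))  # x1 = x2 = FALSE, x3 = TRUE
```

With all variables TRUE only the last clause is violated, while the second configuration is a solution and has zero energy.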
2.2.2 Introduction to k-XORSAT

Another interesting boolean satisfiability problem goes under the name of k-XORSAT, and is obtained
when the boolean function $\mathcal{F}(x)$ is the boolean equivalent of a linear system of equations:

$$ \mathcal{F}(x) = \bigwedge_{m=1}^{M} \left( x_{i^m_1} \oplus \cdots \oplus x_{i^m_k} = y_m \right) \tag{2.10} $$

where the symbol $\oplus$ denotes the logical operation XOR, where $i^m_j \in \{1, \ldots, N\}$ for $m = 1, \ldots, M$
and $j = 1, \ldots, k$ are some variable indices, and where $y = (y_1, \ldots, y_M)$ is some constant boolean
vector. If we make the correspondence TRUE = 1 and FALSE = 0, this formula is equivalent to the
linear system

$$ \begin{cases}
x_{i^1_1} \oplus \cdots \oplus x_{i^1_k} = y_1 , \\
x_{i^2_1} \oplus \cdots \oplus x_{i^2_k} = y_2 , \\
\qquad \vdots \\
x_{i^M_1} \oplus \cdots \oplus x_{i^M_k} = y_M .
\end{cases} \tag{2.11} $$

An immediate consequence of this remark is that a very efficient algorithm is available to find whether
a given k-XORSAT formula is SAT, which assignments are solutions, and what their number is: the
Gauss elimination procedure. One may even wonder why such a problem is interesting at all, given
that it is equivalent to linear boolean algebra. The reasons are threefold. First, k-XORSAT is less easy
than it seems. For example, if one determines with the Gauss elimination procedure that a k-XORSAT
instance is not satisfiable, one could be interested in finding an approximate optimal configuration, i.e.
an assignment which is guaranteed to satisfy a fraction $1 - \epsilon$ of the maximum possible number of
clauses, for some given $\epsilon > 0$. Such an approximation algorithm, however, is not known (or rather,
no such algorithm is known to work efficiently, the meaning of which will become clear in the next
Section). Second, many questions regarding the dynamics of algorithms that can be applied to both
k-SAT and k-XORSAT are interesting, difficult to answer for k-SAT, more manageable for k-XORSAT,
and a priori should have at least qualitatively similar answers for the two problems. In these cases,
k-XORSAT constitutes an excellent starting point to understand what happens in k-SAT. Finally, and
foremost from the point of view of physicists, because k-XORSAT is a legitimate, and very interesting,
spin glass model in its own right. In fact, the diluted Ising $p$-spin model with couplings $\pm 1$ is k-XORSAT:
defining the energy as the number of violated clauses (as for k-SAT) and using the correspondence
between boolean variables and Ising spins, we have

$$ E(\sigma) = \sum_{m=1}^{M} \frac{1 - J_m \sigma_{i^m_1} \cdots \sigma_{i^m_k}}{2} , \tag{2.12} $$

where the coupling $J_m = \pm 1$ is determined by $y_m$.

As in the case of k-SAT, the spin glass model is defined for some distribution of disorder,
corresponding to an ensemble of possible k-XORSAT formulae with a given measure, and we shall consider
the thermodynamic limit $N \to \infty$ with some finite $\alpha = M/N$.
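
The Gauss elimination procedure mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: each equation is stored as an integer bitmask (bit $i$ for the coefficient of $x_i$, with 0-based indices, and bit `n_vars` for the right hand side), the rows are kept in reduced form, and the name `solve_xorsat` is hypothetical. If the system is consistent, the number of solutions is $2^{N - \text{rank}}$.

```python
def solve_xorsat(equations, n_vars):
    """Gauss elimination over GF(2).

    equations: list of (variable_indices, y) with 0-based indices and y in {0, 1}.
    Returns (is_sat, number_of_solutions, one_solution_or_None).
    """
    coefficient_mask = (1 << n_vars) - 1
    pivot_rows = {}  # pivot column -> fully reduced row
    for indices, y in equations:
        row = y << n_vars
        for i in indices:
            row ^= 1 << i  # XOR, so a variable repeated twice cancels out
        # Remove from the new row every variable that is already a pivot.
        for col, pivot_row in pivot_rows.items():
            if (row >> col) & 1:
                row ^= pivot_row
        coefficients = row & coefficient_mask
        if coefficients == 0:
            if (row >> n_vars) & 1:
                return False, 0, None  # the row reads 0 = 1: contradiction
            continue  # the row reads 0 = 0: redundant equation
        col = coefficients.bit_length() - 1  # highest remaining variable is the new pivot
        # Remove the new pivot variable from the rows found so far.
        for other in pivot_rows:
            if (pivot_rows[other] >> col) & 1:
                pivot_rows[other] ^= row
        pivot_rows[col] = row
    # Free variables are set to 0; each pivot variable then equals its right hand side.
    solution = [0] * n_vars
    for col, row in pivot_rows.items():
        solution[col] = (row >> n_vars) & 1
    return True, 2 ** (n_vars - len(pivot_rows)), solution

equations = [([0, 1, 2], 1), ([1, 2, 3], 0), ([0, 2, 3], 1)]
print(solve_xorsat(equations, 4))
```

The three equations of the example are independent, so with four variables the system has two solutions.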

2.3 Computational complexity 

Introducing k-XORSAT, I made the following implicit statement: that since an efficient algorithm for
solving it was known, it could possibly be regarded as a less interesting problem than k-SAT. Is such
a statement reasonable? Not really: whether a problem is "harder" than another or not should be an
intrinsic property of the problem, if it is meaningful at all, and should not be related to our knowledge
(or lack thereof) of algorithms.

The question of what makes a problem intrinsically "hard", and how to compare the "hardness" of
different problems without introducing contingent dependencies (on the techniques and tools actually
available to solve them) is the subject of computational complexity theory. It is a branch of rigorous
mathematics, and it involves highly abstract (and quite complicated) models of computation. With
no pretense in this direction, I shall only aim at giving the "flavor" of the most relevant concepts and
results. An excellent (rigorous) introduction to the field is provided by the already cited reference [2].

2.3.1 Algorithms and computational resources 

The first issue to be addressed is how to measure computational complexity. Let us consider that 
we have some decision problem, and an algorithm which can solve any instance of the problem. In 
order to compute the solution to the problem, the algorithm will use some computational resources. 
The most important of them is the time it will take to complete the computation. Other examples 
are the memory required to store the intermediate steps of the computation (usually referred to as 
space); some algorithms are probabilistic (we shall discuss them later), and require a supply of random 
numbers; in order to save space, some intermediate results may have to be erased, which has an energy 
cost (the loss of information corresponds to a decrease in entropy). There are several other relevant
resources that one can consider. However, I shall consider only time. 

In order to eliminate the dependency of the running time on such practical aspects as the hardware
used to perform the computation or the actual code used to implement the algorithm, time will be
defined as the number of elementary operations (such as arithmetic operations on single digit numbers,
or comparisons between bits, et cetera) needed to complete the calculation. This will depend on the
particular instance of the problem considered, and general results are obtained considering the worst
possible instance for any given size $n$ of the problem, and then taking the asymptotic behavior for
large $n$. For example, if two different algorithms are available to solve the same problem, with times
that scale as $t_1 \sim O(n^2)$ and $t_2 \sim O(n^2 \log n)$ respectively, then for large enough $n$ it is certain that
algorithm 1 will perform better than algorithm 2, regardless of the details of the dependency of $t$ on
$n$, and therefore of the specificities of the implementation.

Clearly, the main theoretical distinction will be between algorithms that have running times that
increase as polynomials of the input size, and algorithms for which $t$ increases as an exponential of
the input size. This is easily seen by considering what happens to the "accessible" size of the input
if the speed at which elementary operations are performed is increased by some constant factor, for
different scaling behaviors of $t$ versus $n$. This is done in Table 2.1. Notice, however, that in practice
a polynomial algorithm with an enormous coefficient will take much longer than an exponential one
with a tiny rate in the exponent, up to very large values of $n$. The point is that in the analysis of
known algorithms, such "extreme" coefficients never occur.

2.3.2 Computation models and complexity classes 

The analysis of algorithms provides (constructive) upper bounds on the computational resources
required by the algorithm to solve some problem. A more interesting (and challenging) question would
be to find some lower bound on the resources needed to perform some computation, independently of
the algorithm used, which would then be a property of the problem itself. The theory of computational
complexity tries to answer this question.

t            n_a(1)    n_a(100)       n_a(10000)
O(n)         n_1       100 × n_1      10000 × n_1
O(n^2)       n_2       10 × n_2       100 × n_2
O(n^3)       n_3       4.6 × n_3      21.5 × n_3
O(2^n)       n_4       n_4 + 6.6      n_4 + 13.3
O(2^{2n})    n_5       n_5 + 3.3      n_5 + 6.6

Table 2.1: Increase of the "accessible" problem sizes for different scalings of running time, and for
different increases in the computer speed. The first column reports the scaling of $t$ as a function of
$n$ for different algorithms; the second column is the size of problems that can be computed in some
given maximum time, which is denoted by $n_a$; the third column reports the value of $n_a$ obtained if
the computer speed is increased by a factor 100; the last column corresponds to a factor of 10000.
Notice that while polynomial algorithms have accessible sizes that increase by a constant factor, for
exponential algorithms the increase is an additive constant.
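
The entries of Table 2.1 follow from a one-line computation, sketched below in Python with hypothetical function names. If $t \sim n^d$, multiplying the speed by $s$ multiplies the accessible size by $s^{1/d}$; if $t \sim 2^{c n}$, the accessible size increases by the additive amount $\log_2(s) / c$.

```python
import math

def polynomial_factor(degree, speedup):
    """For t ~ n^degree the accessible size is multiplied by speedup^(1/degree)."""
    return speedup ** (1.0 / degree)

def exponential_shift(rate, speedup):
    """For t ~ 2^(rate * n) the accessible size grows by log2(speedup) / rate."""
    return math.log2(speedup) / rate

for speedup in (100, 10000):
    factors = [round(polynomial_factor(d, speedup), 1) for d in (1, 2, 3)]
    shifts = [round(exponential_shift(c, speedup), 1) for c in (1, 2)]
    print(speedup, factors, shifts)
```

Rounded to one decimal place, these values reproduce the factors and the additive constants listed in the table.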

In order to do that, computation models are introduced, which define what can (and cannot) be
done in a computation. The most celebrated example of computation model is the Turing machine
[23], which consists of the following: a tape, made of an unlimited number of squares, each of which can
contain a symbol $s$ from some finite alphabet $\Sigma$; a head which reads the tape and can perform some
action $a$ on it, such as "write $s$ in this empty square", "move right one square", "erase this square",
"halt" et cetera; an internal state of the head, which is an element $q_i$ of a finite set $\{q_1, \ldots, q_r\}$;
finally, a computation rule, which associates to any pair $(s, q_i)$ a pair $(a, q_{i'})$, where $s$ is the symbol
on the square currently under the head and $q_i$ is its internal state, and, depending on these, $a$ is the
action performed by the head and $q_{i'}$ is the new internal state of the head.

The computation begins with some input written on the tape, and proceeds according to the 
computation rule, until the computation ends (i.e. the head halts). The result of the computation is 
what is written on the tape at the end. Different computation rules will compute different quantities, 
i.e. solve different problems. 
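
A minimal simulator makes the definition concrete. The Python sketch below is an illustration under assumptions of its own: the tape is stored sparsely as a dictionary, the blank symbol is `_`, actions are the strings `"left"`, `"right"`, `"erase"`, `"halt"` or a pair `("write", s)`, and the rule table `FLIP_RULES` (a machine that inverts every bit of its input) is a made-up example.

```python
def run_turing_machine(rules, tape_input, start_state, blank="_", max_steps=10_000):
    """rules maps (symbol, state) to (action, new_state); returns the final tape content."""
    tape = dict(enumerate(tape_input))  # sparse tape: position -> symbol
    head, state = 0, start_state
    for _ in range(max_steps):
        symbol = tape.get(head, blank)
        action, state = rules[(symbol, state)]
        if action == "halt":
            break
        elif action == "right":
            head += 1
        elif action == "left":
            head -= 1
        elif action == "erase":
            tape.pop(head, None)
        else:  # ("write", s)
            tape[head] = action[1]
    else:
        raise RuntimeError("step limit reached before the machine halted")
    if not tape:
        return ""
    cells = [tape.get(i, blank) for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(blank)

# Hypothetical example machine: invert each bit, then halt on the first blank square.
FLIP_RULES = {
    ("0", "scan"): (("write", "1"), "move"),
    ("1", "scan"): (("write", "0"), "move"),
    ("0", "move"): ("right", "scan"),
    ("1", "move"): ("right", "scan"),
    ("_", "scan"): ("halt", "scan"),
}
print(run_turing_machine(FLIP_RULES, "1011", "scan"))
```

Since each rule performs a single action, writing a symbol and moving the head take two steps, which is why the example uses the two internal states `scan` and `move`.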

Notice that any decision problem can be expressed in such a way that the instance is a string
written in the alphabet $\Sigma$ and the output is YES or NO, and therefore can be addressed by a
suitable Turing machine. For example, a graph can be represented by a string over the alphabet
$\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, (, -, )\}$ by specifying the number of vertices and then, for each edge, the pair of
vertices it connects, for example: 5(1-3)(1-5)(2-4)(2-5)(3-5). The decision problem is then
equivalent to identifying which strings correspond to instances for which the answer is YES, that is
to say whether or not the input string is an element of the subset of possible strings for which the
answer is YES. Since subsets of possible strings are often called languages, decision problems are also
referred to as languages, or as set recognition problems.
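
The graph encoding described above is easy to produce and to parse. The following Python sketch, with hypothetical function names, writes a graph as the number of vertices followed by one parenthesized pair per edge, and reads it back, rejecting strings that are not valid encodings.

```python
import re

def encode_graph(n_vertices, edges):
    """Number of vertices followed by one '(u-v)' group per edge."""
    return str(n_vertices) + "".join(f"({u}-{v})" for u, v in edges)

def decode_graph(text):
    """Inverse of encode_graph; raises ValueError on a malformed string."""
    match = re.fullmatch(r"(\d+)((?:\(\d+-\d+\))*)", text)
    if match is None:
        raise ValueError("not a valid graph encoding")
    n_vertices = int(match.group(1))
    pairs = re.findall(r"\((\d+)-(\d+)\)", match.group(2))
    return n_vertices, [(int(u), int(v)) for u, v in pairs]

edges = [(1, 3), (1, 5), (2, 4), (2, 5), (3, 5)]
print(encode_graph(5, edges))
print(decode_graph("5(1-3)(1-5)(2-4)(2-5)(3-5)"))
```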

There are many variants of the Turing machine, such as binary machines, working on the alphabet
$\{0, 1\}$; or multi-tape machines (which have a finite number of tapes and heads, and for which the
computation rule specifies the joint action of all of them); or universal machines, for which the
computation rule is provided as an input on the tape (which can always be done, since the rule can
be represented as a string). For most of them, it can be proved that they are equivalent to a simple
Turing machine, with an overhead on running time which is at most polynomial in the input size.
Moreover, many other computation models, sometimes drastically different from the Turing machine,
have been proved to be equivalent to it. It is a well established belief (but far from provable), going
under the name of Church-Turing thesis, that any computation which can physically be performed
can be represented by a Turing machine.

Another very important variant is the non-deterministic Turing machine, which is a Turing machine
with a computation rule which is not single valued: the machine is able to "split" (creating an identical
copy of itself) and perform different actions on different tapes. One can either interpret this as a
Turing machine with an infinite number of heads and tapes and which can transfer an infinite amount
of information from one tape to another, or as "the luckiest" Turing machine, which at each split
only executes one of the possible actions prescribed by the computation rule, and such that it leads
to the "best" answer for the problem. Such a computation model is not feasible in practice, but we
shall see that it is very important from the theoretical point of view. In the following, by polynomial
time I shall always mean on a deterministic Turing machine, unless otherwise specified.

Since the Turing machine is such a general paradigm for computations, it can be used to define 
complexity classes, i.e. classes of problems that have similar complexity. There are many different 
complexity classes that are relevant, but we shall focus on two of them: 

Deterministic Polynomial Time (P) The class P is defined as the class of all decision problems 
that can be solved in polynomial time by a deterministic (i.e. "normal") Turing machine. 

Non-deterministic Polynomial Time (NP) The class NP is defined as the class of all decision 
problems that can be solved in polynomial time by a non-deterministic Turing machine. 

Some comments are in order. First, notice that these class definitions do not refer to any specific
algorithm: it is the fact that it is possible to solve them under certain conditions which matters, not
that we are able to do it. Notably, for many NP problems no polynomial time algorithm is known,
so the possibility to solve them in polynomial time on non-deterministic Turing machines is a mere
definition.

However, and this is the second point, it is a very meaningful definition: for most problems, it
is clear whether a problem is in P, in NP, or in neither of the two. For example, for k-SAT an obvious
algorithm is polynomial on a non-deterministic Turing machine: proceed in steps, and assign a variable
at each step, splitting between the assignments TRUE and FALSE, then simplify the formula, and verify
that there are no contradictions (i.e. clauses which cannot be satisfied); if a contradiction is found, halt
the corresponding head; if some head succeeds in assigning all the variables, then it has found a satisfying
assignment and the answer is SAT; on the contrary, if all the heads halt before they have assigned all
the variables, there is no satisfying assignment and the answer is UNSAT. This procedure is obviously
polynomial, so k-SAT is in NP. On the other hand, we have seen that the Gauss elimination procedure
is polynomial (on a normal computer, and therefore on a Turing machine as well), and so k-XORSAT
is in P.
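
On a deterministic machine, the splitting procedure just described can only be simulated by exploring the branches one after the other, backtracking when a head would halt. The Python sketch below illustrates this under the assumption that literals are encoded as signed integers (`+i` for $x_i$, `-i` for its negation); the function names are hypothetical. Each branch takes polynomial time, but the number of branches can grow exponentially with the number of variables, which is exactly what the non-deterministic machine avoids by following all of them in parallel.

```python
def simplify(formula, literal):
    """Set `literal` to TRUE: drop satisfied clauses, delete the opposite literal elsewhere."""
    reduced = []
    for clause in formula:
        if literal in clause:
            continue  # clause already satisfied
        reduced.append([lit for lit in clause if lit != -literal])
    return reduced

def search(formula, n_vars, variable=1, assignment=None):
    """Return a satisfying assignment {variable: bool}, or None if there is none."""
    assignment = assignment or {}
    if any(len(clause) == 0 for clause in formula):
        return None  # contradiction: this branch ("head") halts
    if variable > n_vars:
        return assignment  # all variables assigned without contradiction
    for value in (True, False):  # the "split" between TRUE and FALSE
        literal = variable if value else -variable
        result = search(
            simplify(formula, literal),
            n_vars,
            variable + 1,
            {**assignment, variable: value},
        )
        if result is not None:
            return result
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
print(search([[1, -2], [2, 3], [-1, -3]], 3))
```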

Third, notice that any problem which is in P is also, a fortiori, in NP. In fact, the question of 
whether P and NP are equal (i.e. if there exist polynomial time algorithms to solve any NP problem) 
is one of the central open problems in complexity theory. It is strongly believed that the answer is 
no, but no proof (or disproof) of this is known. 

Fourth, an equivalent, and more "practical", definition of NP is the following: NP is the class
of all problems for which it is possible to issue a certificate in (deterministic) polynomial time. A
certificate is the answer YES or NO for a specific configuration, provided as input together with
the instance of the problem. In other words, NP problems are such that a candidate solution can be
verified in polynomial time. Again, it is obvious that k-SAT is in NP, and that any problem in P is
also in NP. The equivalence of the two definitions is easy to verify: if a certificate to a problem can
be issued in polynomial time, a non-deterministic Turing machine can test in parallel all the possible
configurations and find if some of them have answer YES. On the other hand, if a non-deterministic
Turing machine can solve in polynomial time a problem, it can also check if any of the configurations
for which the answer is YES coincides with the configuration submitted for the certificate.
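
Issuing a certificate is typically a very simple computation. As an illustration, the Python sketch below takes graph 3-coloring as the NP problem (a choice made for this example) and checks a candidate coloring in a time linear in the number of vertices and edges; the function name and the 0-based vertex labels are assumptions of the sketch.

```python
def verify_coloring(n_vertices, edges, coloring, n_colors=3):
    """Answer YES (True) if `coloring` is a proper coloring of the graph, NO (False) otherwise."""
    if len(coloring) != n_vertices:
        return False
    if any(color not in range(n_colors) for color in coloring):
        return False
    # A coloring is proper if no edge joins two vertices of the same color.
    return all(coloring[u] != coloring[v] for u, v in edges)

triangle = [(0, 1), (1, 2), (0, 2)]
print(verify_coloring(3, triangle, [0, 1, 2]))
print(verify_coloring(3, triangle, [0, 1, 1]))
```

Finding a proper coloring may be hard, but checking a proposed one requires a single pass over the edges.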

Finally, notice that these definitions, given for decision problems, actually extend to search and
optimization problems, so that if a decision problem belongs to NP (or P), then all of them are in the
same class. For example, the optimization problem of k-SAT consists in finding the smallest value of
$E$ such that the decision problem "An assignment which satisfies $M - E$ clauses exists" gives answer
YES. One can solve in (non-deterministic) polynomial time for $E = 0$, then for $E = 1$ and so on,
and find in (non-deterministic) polynomial time the smallest $E$. However, the complexity classes of
enumeration problems are often different.
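
The argument above amounts to a loop of at most $M + 1$ calls to a decision procedure. In the Python sketch below the decision procedure is a brute-force stand-in (the non-deterministic machine of the text is of course not available), literals are encoded as signed integers, and the function names are hypothetical; the point is only the structure of the outer loop.

```python
from itertools import product

def decide(formula, n_vars, max_violated):
    """Decision problem: is there an assignment violating at most max_violated clauses?"""
    for values in product([False, True], repeat=n_vars):
        violated = sum(
            1
            for clause in formula
            # a literal l is true when the value of its variable matches its sign
            if not any(values[abs(l) - 1] == (l > 0) for l in clause)
        )
        if violated <= max_violated:
            return True
    return False

def minimum_energy(formula, n_vars):
    """Smallest E for which the decision problem answers YES."""
    for energy in range(len(formula) + 1):
        if decide(formula, n_vars, energy):
            return energy

print(minimum_energy([[1], [-1], [2], [-2]], 2))
```

The loop always terminates, since every assignment violates at most $M$ clauses.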

2.3.3 Reductions, hardness and completeness 

A reduction is a polynomial time algorithm which maps an instance of some decision problem into an
instance of some other decision problem, such that the two instances always have the same answer.
More formally, let us consider two decision problems $A$ and $B$. Recall that $A$ (and also $B$) can be
viewed as the subset of the strings over the alphabet $\{0, 1\}$ which describe the instances of the problem
that give answer YES. Then, we can write $x \in A$ to mean that the string $x$ represents an instance of
problem $A$ for which the answer is YES, denote by $|x|$ the length of the string $x$, and define functions
that associate a string to another string, i.e. $f : \{0, 1\}^* \to \{0, 1\}^*$ (the superscript $*$ denotes the set
of all the possible strings in the alphabet). A formal definition of reduction is the following:

Reduction A decision problem $A$ reduces to the decision problem $B$, denoted by $A \le_p B$, if there
exists a function $f : \{0, 1\}^* \to \{0, 1\}^*$, computable in polynomial time $p(|x|)$, such that
$x \in A \iff f(x) \in B$.

Notice that since the function is computable in time bounded by $p(|x|)$, we must have

$$ |f(x)| \le p(|x|) . \tag{2.13} $$

The concept of reduction is very powerful, since it permits to relate the complexity of different 
problems. In particular, one can define problems that are "at least as difficult" as any problem in 
some class: 

Hardness A decision problem $A$ is $C$-hard for some computational complexity class $C$ if for any
problem $B \in C$, $B \le_p A$.

Completeness A decision problem $A$ is $C$-complete for some computational complexity class $C$ if
$A \in C$ and for any problem $B \in C$, $B \le_p A$.

Loosely speaking, $C$-complete problems are the most difficult problems to solve in class $C$, and if an
efficient (i.e. polynomial) algorithm is found for a $C$-complete problem, it can be used to solve
efficiently any problem in $C$ (for which a reduction is known).

The importance of k-SAT in complexity theory is due to the following

Cook-Levin Theorem 3-SAT is NP-complete [Ml Hi] • 

This was the first result on NP-completeness: it introduced the concept, and proved that several other
problems, to which SAT can be reduced, were also NP-complete.

The proof of the Cook-Levin theorem is surprisingly simple, and emphasizes the (conceptual)
importance of the non-deterministic Turing machine: it is simply a mapping of the time evolution
of the Turing machine into a SAT formula, in which the interpretation of boolean variables is "The
cell $i$ contains the symbol $j$ at time $k$ in the computation", or "The head is over cell $i$ at step $k$
in the computation", and "The head is in state $q_i$ at the step $k$ of the computation" (where $i, j, k$
act as variable indices). The proof shows how to form a legitimate SAT formula for any given
non-deterministic Turing machine, and then that any SAT formula can be reduced to 3-SAT.
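
The last step, reducing a general CNF formula to clauses of at most three literals, can be illustrated with the standard textbook construction, which may differ in its details from the one used in the cited proof. A long clause $(l_1 \lor l_2 \lor \cdots)$ is split into $(l_1 \lor l_2 \lor z)$ and $(\bar{z} \lor \cdots)$, where $z$ is a fresh auxiliary variable, and the step is repeated until no clause is longer than three. The Python sketch below assumes literals encoded as signed integers; the function names are hypothetical, and padding shorter clauses to exactly three literals is omitted.

```python
from itertools import product

def to_3sat(formula, n_vars):
    """Split every clause longer than 3 literals using fresh auxiliary variables.

    Returns the new formula (clauses of length at most 3) and the new number of variables.
    """
    result = []
    next_var = n_vars
    for clause in formula:
        clause = list(clause)
        while len(clause) > 3:
            next_var += 1
            # (l1 or l2 or rest)  ->  (l1 or l2 or z) and (not z or rest)
            result.append([clause[0], clause[1], next_var])
            clause = [-next_var] + clause[2:]
        result.append(clause)
    return result, next_var

def is_sat(formula, n_vars):
    """Exhaustive satisfiability check, only meant for tiny examples."""
    return any(
        all(any(values[abs(l) - 1] == (l > 0) for l in clause) for clause in formula)
        for values in product([False, True], repeat=n_vars)
    )

print(to_3sat([[1, 2, 3, 4, 5]], 5))
```

The map runs in a time linear in the total length of the formula, and the new formula is satisfiable exactly when the original one is, as required of a reduction.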

k-SAT proves a very powerful tool for reductions, because of its generality and simple structure.
The following problems are easily proven NP-complete, by reducing k-SAT to them: Integer Linear
Programming, Hamiltonian Cycle, Traveling Salesman, Max Clique, Max Cut, Vertex Covering,
3-Coloring, .... The list is very, very long.

The fact that so many important problems are NP-complete, and that no efficient algorithms are known
(and probably none exist) to solve them, seems very discouraging in view of applications. However, this
need not be the case, as I shall point out in the following Paragraph.

2.3.4 Other measures of complexity 

The complexity classes P and NP are defined in terms of the asymptotic behavior of the running time 
for the worst possible instance of any size n. In many practical problems, one can be satisfied if some 
much less stringent requirements are met: 

• If the typical running time over some distribution of instances is polynomial. 

• If an approximate optimal solution can be found in polynomial time for any approximation
factor $\epsilon$.

Average-case complexity theory studies the first question; the theory of complexity of approximation 
studies the second. 

Many average-case complexity results analyze the average time that some given algorithm takes 
to solve an instance of a problem, for a given distribution of instances. It is often the case that an
NP problem is solved in polynomial time on average over some "natural" distribution of instances.
For example, for problems defined on graphs, one can form the uniform distribution over all graphs 
containing n vertices and with some average connectivity. Then, one can prove that 3-COL can be 
solved in linear time on average. Often, however, all the algorithms known for some NP problem take 
exponential time on average. Alternatively, one can study the probability with which an algorithm 
finds an answer in polynomial time. 

A crucial point in average-case complexity theory is the choice of the distribution. For example,
the best known algorithm for Subset Sum takes exponential time if the $n$ numbers in the set are taken
uniformly in the range $[1, 2^n]$. However, if this range is extended to $[1, 2^{n \log n}]$, the average time for
the best algorithm becomes polynomial. Even in cases when the dependency on the distribution is
less dramatic, it remains a crucial point. For example, the reductions that map many NP problems
on k-SAT introduce a very peculiar structure in the k-SAT formulae they generate, so that even though
the distribution of the instances of the original problem is a natural one, the distribution of k-SAT
formulae that are obtained is almost never a natural one. Thus, even though k-SAT can be solved
efficiently on average in many cases under natural distributions, these results do not extend to the
problems that can be reduced to k-SAT. On the other hand, even when it is possible to characterize
the distribution of k-SAT formulae generated by some reduction, it is usually impossible either to find
an algorithm that is efficient on average on them, or even to perform the analysis of the average case.
This poses a severe limitation to the applicability of average-case complexity results.

On the other hand, complexity of approximation results are very interesting in view of applications. 
They are, however, usually more technical than the results I have discussed, and beyond the level of 
this introduction. I shall only cite the Probabilistically Checkable Proof (PCP) theorem and its 
consequences on the approximability of MAX-3-SAT, which is the optimization problem of 3-CNF
formulae. 

One of the two equivalent definitions of the class NP requires that NP problems can be certified 
in polynomial time. The following definition extends the same concept: 

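Probabilistically Checkable Proof (PCP) Given two functions $r, q : \mathbb{N} \to \mathbb{N}$, a problem $L$ belongs
to the class PCP($r$, $q$) if there is a polynomial time probabilistic function (called verifier)
$V : \{0, 1\}^* \to \{0, 1\}$ which is given as input a string $x$, a string $\pi$ (called proof) and a sequence of
$r(|x|)$ random bits; which uses a substring of $\pi$, of size $q(|x|)$ and chosen at random, to
compute $V^\pi(x)$; and which is such that

$$ x \in L \;\Rightarrow\; \exists\, \pi : \; P[V^\pi(x) = 1] = 1 , \qquad x \notin L \;\Rightarrow\; \forall\, \pi : \; P[V^\pi(x) = 1] \le \frac{1}{2} . \tag{2.14} $$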



In this definition, the proof $\pi$ is the analogue of the candidate configuration in NP: it is some string
which is provided as an input, and which, if well chosen, can prove that $x \in L$ (i.e. that the answer
to the instance represented by $x$ of the decision problem $L$ is YES). The verifier $V^\pi(x)$ is the analogue
of the algorithm which issues the certificate, i.e. it gives, in polynomial time, an answer which is
YES or NO and which is related to the answer to the instance represented by $x$. However, $V^\pi(x)$ is
probabilistic, that is to say, it is a random variable. The source of the randomness is provided by
the $r(|x|)$ random bits used to compute $V^\pi(x)$. For the problems in PCP($r$, $q$), the distribution of the
values of $V^\pi(x)$ verifies the condition (2.14). Finally, notice that only a number $q(|x|)$ of symbols in $\pi$
is actually used in the computation of $V^\pi(x)$, and these symbols are chosen at random.

At first sight, the class PCP seems very unnatural, and of little interest. The following theorem 
proves this impression very much wrong: 

PCP Theorem NP = PCP($O(\log n)$, $O(1)$).

Again, several remarks. First, notice that any mathematical statement can be represented by a string,
and that any mathematical proof can be represented by another string. Mathematical statements can
be divided in two: right ones (i.e. theorems), and wrong ones. One can consider the following decision
problem, called THEOREM: given a mathematical statement, is it a theorem? It is clear enough that
it is possible to verify if a proof provided to support a statement is correct or not in a time which is
polynomial in the length of the proof. Therefore, THEOREM is in NP.

What this theorem states is that any theorem represented by a string $x$ can be recognized by
looking at a finite number of randomly chosen bits of some suitable proof, represented by a string
$\pi$, and evaluating some polynomial time function $V$. Then, if $V^\pi(x) = 0$ the statement is not a
theorem with probability 1, while if $V^\pi(x) = 1$ it may or may not be. Conversely, if the statement
$x$ is a theorem, then there must be some proof $\pi$ such that $V^\pi(x) = 1$ with probability 1, and if the
statement is not a theorem, the probability that $V^\pi(x) = 1$ is less than or equal to 1/2 for any proof
string $\pi$. One can therefore check if the proof of any theorem (of any length) is correct just by looking
at a finite number of bits in the proof, provided it is put in a suitable form, and obtain a probabilistic
result which is correct with probability 1 if the answer is NO, and correct with probability $p$ if the
answer is YES, for any $p < 1$.

Second, the same reasoning applies to any NP decision problem, not just THEOREM. For example,
if, instead of providing a candidate solution to check if an instance of k-SAT is satisfiable, one provided
a PCP proof $\pi$, then it would be possible to check it in constant time, rather than polynomial,
obtaining a probabilistic result.

Third, even though the PCP theorem is very surprising in itself, the following corollary is also 
remarkable: 

Hardness of approximation of MAX-3-SAT The PCP Theorem implies that there exists $\epsilon > 0$
such that $(1 - \epsilon)$-approximation of MAX-3-SAT is NP-hard.

In other words, it is at least as difficult to find an approximation to the optimal assignment as it is to 
find the optimal assignment itself (if the approximation has to be good enough). 

The theory of complexity of approximation is very rich and well established. However, I shall not 
discuss it any further. 

2.3.5 Connections to the work presented in Part II 

In Chapter 4 I shall present some results about what a certain class of algorithms can and cannot do
on average for k-XORSAT, and also for an extension of k-XORSAT which is NP-complete.

The motivation for the work presented in Chapter 5 is a recent result which establishes a
relation between the average-case complexity of 3-SAT on the uniform distribution and the worst-case
complexity of approximation for several problems. The results I shall present provide an indication
that some hypothesis, on which the previous relation is based, might be wrong.

2.4 Search algorithms 

In the previous Section, I have introduced the concept of computational complexity, which measures
how difficult it is to solve a problem. In this Section, I shall introduce several algorithms that attempt
to do it in practice for the search problems associated with k-SAT and k-XORSAT, that is to say algorithms
which try to find satisfying assignments for a given formula.

There is a huge variety of approaches and "strategies" to solve combinatorial optimization problems,
and notably k-SAT. It is important to notice that, due to their formal similarity, the vast majority
of the algorithms that can solve k-SAT can also solve k-XORSAT and vice versa, although with different
performances (sometimes dramatically so). I shall therefore discuss the two problems jointly, specifying
the cases in which there are notable differences.

This introduction will be far from exhaustive: I shall focus on those algorithms of interest in view
of the discussion of Part II. They can be broadly divided into two families:

Random walks In random walks, all the variables are assigned at the first step of execution, typically
at random, or following some more refined rule. In the following steps, single variables or groups 
of variables are selected and "flipped" (i.e. their value is changed), according to some stochastic 
rule which depends on the configuration. The algorithm stops when a solution is found, or when 
an upper bound to the number of steps has been reached. An algorithm in this family is specified 
by the rule according to which variables are flipped. 
DPLL Procedure In the DPLL procedure, variables are assigned sequentially: at each step, a
variable is selected according to some heuristic rule, and its value set according to some strategy. 
Once a variable is assigned, the formula is simplified by replacing it with its value. Under this 
process, the formula therefore evolves into a shorter and mixed one (i.e. including clauses of 
different lengths). Two events are especially important in the DPLL procedure: the generation 
of unit clauses and of contradictions. An algorithm in the DPLL family is specified by these 
four characteristics: the heuristic, the strategy, the action taken in the presence of unit clauses, 
and that in presence of contradictions. 

The rest of this Section is organized in two Paragraphs, one for each family of algorithms. In
each case, I shall consider the average case performance over the uniform distribution of instances, for
either k-SAT or k-XORSAT.



2.4.1 Random-walk algorithms

The most familiar random-walk algorithm for physicists is the Metropolis Monte-Carlo procedure,
which is capable of sampling configurations with probability equal to their Gibbs weight. In particular,
the zero temperature version of the Metropolis algorithm consists in picking at each step a variable at
random and flipping it if this decreases the number of violated clauses, and is a very simple example of a
"greedy" algorithm, i.e. an algorithm which tries to perform a local optimization of the configuration
at every move.

Based on the qualitative arguments about frustration presented in Paragraph 1.2.4, such a local
optimization procedure is bound to fail in disordered systems. The following argument shows that
this is the case with probability 1 for uniformly drawn random instances of 3-XORSAT. Consider the
subformula represented in Figure 2.1, which I shall call a "blocked island". It is clear that if such a
subformula is present in the formula, and if it is found in a configuration such as one of those depicted
in the figure, a greedy algorithm will not be able to reach a satisfying assignment. In [27] it is shown
that in the limit $N \to \infty$ this situation occurs with finite probability

$$ p \simeq \frac{729}{1024}\, \alpha^7 e^{-45\alpha} \tag{2.15} $$

where $\alpha$ is the clause to variable ratio, $\alpha = M/N$. The average number of blocked islands in a random
3-XORSAT formula is $pN \sim O(N)$, and it is a lower bound to the minimum number of violated clauses
of configurations that greedy algorithms are able to find.

More interesting are "less greedy" algorithms. A simple example is provided by Pure Random
Walk SAT (PRWalkSAT), which was introduced in [28], and is defined as follows: initially, assign all
the variables uniformly at random; then, at each step pick uniformly at random a clause among those
that are violated, and a variable among those appearing in it, and flip it; repeat, until a satisfying
assignment is found, or a number $T_{\max}$ of steps has been performed. Notice that by flipping a variable
which appears in a violated clause, that clause becomes satisfied; however, if that variable also appears
in other clauses that were satisfied before the flip, they might become unsatisfied after. This is why
this algorithm is "less" greedy. The possible outcomes of the algorithm are two: either a satisfying
assignment is produced, or the output is undetermined.
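
The definition above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of [28]: the function name `pr_walksat`, the DIMACS-like encoding of clauses as lists of signed integers, and the `rng` argument are assumptions introduced here, and the list of violated clauses is recomputed from scratch at every step for readability rather than efficiency.

```python
import random

def pr_walksat(clauses, n_vars, t_max, rng=random):
    """Pure Random WalkSAT sketch for CNF formulas.

    clauses: list of clauses, each a list of non-zero ints (hypothetical
             DIMACS-like encoding: v means x_v is true, -v means x_v is false).
    Returns a satisfying assignment (dict var -> bool), or None (undetermined).
    """
    # Initially, assign all the variables uniformly at random.
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    for _ in range(t_max):
        violated = [c for c in clauses if not satisfied(c)]
        if not violated:
            return assign
        # Pick a violated clause, then one of its variables, and flip it.
        clause = rng.choice(violated)
        var = abs(rng.choice(clause))
        assign[var] = not assign[var]
    # Step budget exhausted: report a solution only if one was just reached.
    return assign if all(satisfied(c) for c in clauses) else None
```

A call such as `pr_walksat([[1, 2], [-1, 2], [1, -2]], 2, 10000)` returns the unique solution of this small 2-SAT formula, while `None` only means that no solution was found within `t_max` steps, not that the formula is unsatisfiable.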

Figure 2.1: Representation of a "blocked island". Each dot in the diagram corresponds to a variable,
and triangles represent 3-clauses containing the variables at the vertices. The left-most diagram shows
an isolated subformula; the variables in the subformula are all assigned, in such a way that the clauses
marked with the letter S are satisfied, those with the letter U are unsatisfied. If one of the variables
appearing in the central clause is flipped, the second configuration is obtained; if one of the variables
which do not appear in the central clause is flipped, the third configuration is obtained. In both cases,
the number of unsatisfied clauses increases by 1 (From [17]).

In [28], it was shown that PRWalkSAT finds a solution with probability 1 for any satisfiable instance
of 2-SAT in a number of steps (i.e. time) of order $O(N^2)$. An interesting extension of this result to
3-SAT was obtained in [29], where it is shown that if $T_{\max} = 3N$ and the procedure is repeated for a
number $R$ of times without obtaining a satisfying assignment, then the probability that the instance is
SAT is upper-bounded by $\exp[-R\,(3/4)^N]$. By taking $R$ sufficiently larger than $(4/3)^N$, the probability
that an instance for which no satisfying assignment has been found is nonetheless satisfiable can be
made arbitrarily small. Also, notice that, even though the running time of such a procedure (for
any fixed probability bound) is exponential, it is still exponentially smaller than $2^N$, which would be
required by exhaustive search.
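
To make the choice of $R$ concrete, one can ask that the bound on the error probability be smaller than some tolerance $\delta$ (a symbol introduced here only for illustration, not used in [29]). The following short computation is a sketch under that assumption:

```latex
\exp\!\left[-R\,(3/4)^N\right] \le \delta
\quad\Longleftrightarrow\quad
R \ge (4/3)^N \ln(1/\delta) ,
\qquad
\text{total number of steps} = 3N \cdot R \simeq 3N\,(4/3)^N \ln(1/\delta) ,
\qquad
\frac{(4/3)^N}{2^N} = (2/3)^N \xrightarrow[N\to\infty]{} 0 .
```

The last ratio shows in which sense the running time, although exponential, is exponentially smaller than that of exhaustive search.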

The previous results hold for any instance, and the probabilities mentioned are over the choices of
the algorithm. Another interesting question is to analyze the average-case behavior over the uniform
distribution of k-SAT instances. This was done in [30, 31, 32]. In the first of these papers, a rigorous
bound is found for the values of the clause-to-variable ratio $\alpha = M/N$ for which PRWalkSAT finds
a solution in polynomial time with probability 1: $\alpha < \alpha_{\mathrm{PRWalkSAT}} \simeq 1.63$ (for $k = 3$). This is the
first example I mention of an algorithmic bound on $\alpha$, i.e. a threshold value separating two different
behaviors of the same algorithm. Many more will follow. Also, notice that since with probability 1
PRWalkSAT finds a solution for random 3-SAT formulae with $\alpha < \alpha_{\mathrm{PRWalkSAT}}$, this implies that these
formulae are satisfiable with probability 1.

In [31, 32] the same problem was studied with "physical" methods. In particular, a numerical
study indicates that random instances are solvable with probability 1 in polynomial time if $\alpha < 2.7$,
while for larger values the time becomes exponential. The analysis of the master equation performed
in [31] shows that the average fraction of unsatisfied clauses, $\varphi(t)$, after $tN$ steps of the algorithm, is
a deterministic function which depends on $\alpha$ and goes to 0 in finite $t$ if $\alpha < 2.7$, while for larger $\alpha$ it
tends asymptotically to a finite value, which is 0 for $\alpha \simeq 2.7$ and then increases. In this second regime,
it can happen that solutions are found because of fluctuations, but the time which this requires is
exponential in $N$.

A somewhat more complicated variant of this algorithm goes under the name of WalkSAT, and is 
defined as follows: 
procedure WalkSAT(p, T_max)
    Assign uniformly at random each variable
    repeat
        Select uniformly at random a clause C which is UNSAT
        For each variable x_i in C, compute the break-count b(x_i), defined as the number of clauses
        currently satisfied that will be violated if x_i is flipped
        if a variable x_j in C has break-count b(x_j) = 0 then
            Flip x_j
        else with probability p:
            Select the variable in C with the lowest break-count (or select uniformly at random one
            of the variables with the lowest break-count, if there is more than one), and flip it
        else with probability 1 - p:
            Select uniformly at random a variable in C and flip it
        end if
    until there are no UNSAT clauses, or the number of steps exceeds T_max
    if a solution X has been found then return X
    else return undecided
    end if
end procedure

As in the case of PRWalkSAT, variables to be flipped are selected only in clauses that are currently
UNSAT. However, instead of picking a variable at random, WalkSAT looks for a variable which can be
flipped without making any clause UNSAT which is currently SAT. Notice that in doing this the total
number of UNSAT clauses must decrease by at least 1 (i.e. the selected clause becoming SAT). On the
other hand, if some clauses currently SAT have to become UNSAT, the variable which minimizes their
number is selected, with probability $p$, or otherwise any variable in the clause uniformly at random.
In both of these cases, the total number of UNSAT clauses can increase.
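
The WalkSAT procedure described above can be transcribed into Python as follows. This is a sketch under stated assumptions rather than a reference implementation: the name `walksat`, the encoding of clauses as lists of signed integers and the `rng` argument are hypothetical choices, and break-counts are recomputed naively by scanning all clauses, which costs $O(M)$ per candidate variable and would be replaced by incremental bookkeeping in any serious code.

```python
import random

def walksat(clauses, n_vars, p, t_max, rng=random):
    """WalkSAT sketch following the pseudocode above (p as defined there)."""
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def break_count(var):
        # Number of currently satisfied clauses that become violated
        # if var is flipped (computed by trial flip, then undone).
        before = [c for c in clauses if satisfied(c)]
        assign[var] = not assign[var]
        broken = sum(1 for c in before if not satisfied(c))
        assign[var] = not assign[var]
        return broken

    for _ in range(t_max):
        violated = [c for c in clauses if not satisfied(c)]
        if not violated:
            return assign
        clause = rng.choice(violated)
        variables = sorted({abs(lit) for lit in clause})
        counts = {v: break_count(v) for v in variables}
        lowest = min(counts.values())
        best = [v for v in variables if counts[v] == lowest]
        if lowest == 0 or rng.random() < p:
            var = rng.choice(best)        # "free" flip, or greedy move
        else:
            var = rng.choice(variables)   # random move inside the clause
        assign[var] = not assign[var]
    return assign if all(satisfied(c) for c in clauses) else None
```

Setting `p = 0` removes the greedy branch but keeps the break-count zero rule, so even in that limit the sketch does not reduce exactly to PRWalkSAT.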

The average case performance of WalkSAT is astonishingly good. Numerical experiments suggest
that its typical running time (e.g. the median over a series of runs) remains linear for $\alpha$ up to 4.15
(for $k = 3$) [33]. Interestingly, this value coincides with the threshold for the stability of the 1RSB
solution [34].

For larger values of $\alpha$, the behavior of WalkSAT becomes more complicated. The average running
time becomes exponential, with a peculiar structure in the average fraction of unsatisfied clauses as a
function of the number of steps (divided by $N$). A detailed analysis of this behavior is the object of
current work in collaboration with Giorgio Parisi.

2.4.2 DPLL algorithms 

The DPLL procedure is a firmly established complete algorithm for k-SAT and similar constraint
satisfaction problems. For concreteness, and for future reference in Chapter 4, I shall consider the
case of k-XORSAT. DPLL was introduced by Davis and Putnam in 1960 [35] and developed by Davis,
Logemann and Loveland in 1962 [36], and has many variants.

The basic principle is to assign the variables in sequential order, and simplify the formula after
each assignment. This generates a subformula in which clauses that are satisfied are eliminated,
and clauses in which the assigned variable appears decrease in length by one unit. If a unit clause is
generated (i.e. a clause of length 1), this clause determines the value of the variable appearing in it,
and it is assigned accordingly. This event is called Unit Propagation (UP). The rule according to which
the variable to be assigned is selected is called heuristic. Most often, the value assigned is selected
uniformly at random, but sometimes a rule, called strategy, determines it. The simplest example of
heuristic consists in selecting the variable uniformly at random among those not yet assigned, as well
as the value, but giving priority to UP; it is called Unit Clause (UC).

A crucial distinction between DPLL variants is the action taken if a contradiction arises, i.e. in 
the case of k-XORSAT, a pair of unit clauses for the same variable with conflicting assignments. If this
occurs, no value of the variable in question will satisfy the subformula, and therefore the original one. 
This event signals that some of the previous assignments were wrong. Two possible actions can then 
be taken: either modify some of the previous assignments, or output undetermined and possibly 
restart the procedure. In the first case, the algorithm backtracks to the last variable which was set by 
a "free" step (as opposed to a UP or a backtrack), and inverts it. In the second case, the algorithm 
is no longer complete, but we shall see that it can still be interesting in the average case. 

Formally, we can describe the DPLL procedure with and without backtracking with the two following
procedures, in which F is the formula and H is the heuristic, i.e. a function which associates
an index of a variable not yet assigned to a subformula. With no backtracking,

procedure DPLL without backtracking(F, H)
    repeat
        for every unit clause U in F do
            F ← Simplify[F, U]
        i ← H(F)
        F ← Simplify[F, x_i = S(F)]
        if a contradiction is present then
            return undetermined
    until all the variables are assigned
    return true
end procedure

where S(F) is the strategy according to which values for assignments are decided. With backtracking
the procedure is somewhat more complicated, and it is more conveniently expressed in a recursive
form:

procedure DPLL with backtracking(F, H)
    if all the clauses are satisfied then
        return true
    if a contradiction is present then
        return false
    for every unit clause U in F do
        F ← Simplify[F, U]
    i ← H(F)
    return DPLL[Simplify(F, x_i = true), H] ∨ DPLL[Simplify(F, x_i = false), H]
end procedure

The complete variant of DPLL (i.e. the one with backtracking) has been extensively studied (see 
for example [23 EH] and references therein). In the following, I shall concentrate on DPLL without 
backtracking. 
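
For concreteness, DPLL without backtracking with the UC heuristic can be sketched in Python for XORSAT formulae. Everything specific to this sketch is an assumption made for illustration: the names `simplify` and `dpll_uc_no_backtracking`, the encoding of a clause as a pair (set of variable indices, parity bit), and the choice to process one unit clause at a time. A contradiction shows up as a clause that has lost all its variables but still requires parity 1.

```python
import random

def simplify(clauses, var, value):
    """Substitute x_var = value in a list of XOR clauses (variables, parity).

    Returns the simplified list, or None if an empty clause "0 = 1" appears.
    """
    out = []
    for variables, parity in clauses:
        if var in variables:
            variables = variables - {var}
            parity ^= value
        if variables:
            out.append((variables, parity))
        elif parity == 1:
            return None          # contradiction
    return out

def dpll_uc_no_backtracking(clauses, n_vars, rng=random):
    """Return an assignment (dict var -> 0/1), or None for 'undetermined'."""
    formula = [(frozenset(vs), b) for vs, b in clauses]
    assign = {}
    while len(assign) < n_vars:
        unit = next((c for c in formula if len(c[0]) == 1), None)
        if unit is not None:
            # Unit propagation: the clause fixes the value of its variable.
            var, value = next(iter(unit[0])), unit[1]
        else:
            # Free step (UC): random unassigned variable, random value.
            var = rng.choice([v for v in range(n_vars) if v not in assign])
            value = rng.randint(0, 1)
        assign[var] = value
        formula = simplify(formula, var, value)
        if formula is None:
            return None
    return assign
```

On a tree-like formula such as `[((0, 1, 2), 1), ((2, 3, 4), 0)]` no contradiction can arise and an assignment is always returned, whereas the pair `[((0, 1), 0), ((0, 1), 1)]` always produces two conflicting unit clauses and hence `None`.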

Many different heuristics for DPLL have been studied, in view of both theoretical studies and 
applications. In the following, an important role will be played by the Generalized Unit Clause (GUC), 
introduced and studied in [3S1[1D1I1T], which is defined as follows: at each step, select uniformly at 
random a clause among those of shortest length, and then uniformly at random a variable in it. This
generalizes the UP rule to clauses of length larger than one, hence the name.

The analysis of the average case behavior of DPLL heuristics can be simplified considerably using
the following approach, introduced in [42]. Consider the state of the formula after $T$ variables have
been set. It will contain a number $C_j(T)$ of clauses of length $j = 1, 2, \dots, k$ (for some values of $T$, some
unit clauses will not have been removed yet, hence the term $j = 1$). The formula can be described as a
table in which each row represents a clause, and each "slot" in it represents a variable. Initially, there
are $M$ rows, each of length $k$, which then become shorter as the algorithm proceeds. If the heuristics
we consider consist in the selection of either a variable uniformly at random, or of a slot in the table
according to some rule which does not depend on the content of the slots, then the subformulae that
are generated are uniformly random conditioned on their lengths. This is the case of both UC (which
always selects the variable uniformly at random) and of GUC (which selects, uniformly at random,
first a row in the table among those of shortest length, and then a slot in the row).

In the case of UC, at each step a variable is selected uniformly at random. Because of the statistical
independence of the subformulae, each slot has a probability $1/(N - T)$ of containing the selected
variable, and a clause of length $j$ has probability $j/(N - T)$ of containing it. Since the clauses of
length $j$ that contain the selected variable become of length $j - 1$, the average variation in the number
of clauses is

$$ E[C_j(T+1) - C_j(T) \mid \{C_i(T)\}] = \frac{(j+1)\,C_{j+1}(T) - j\,C_j(T)}{N - T} \tag{2.16} $$

where, for notational simplicity, we set $C_{k+1}(T) = 0$. Notice that this is the same equation one
obtains for steps in which UP is applied, when instead of selecting the variable uniformly at random
it is selected among those appearing in unit clauses.

A theorem by Wormald [15], the statement of which is rather technical and I shall omit, ensures
that (under some very general assumptions which are satisfied by all the heuristics we shall consider)
the clause densities are concentrated in the thermodynamic limit,

$$ E[C_j(T)] \simeq N\, c_j(T/N) \tag{2.17} $$

where $c_j(t)$ is a function determined by the differential equation obtained by dividing (2.16) by $\Delta T = 1$:

$$ \frac{dc_j(t)}{dt} = \lim_{N\to\infty} \frac{E[\Delta C_j(T) \mid \{C_i(T)\}]}{\Delta T} = \frac{(j+1)\,c_{j+1}(t) - j\,c_j(t)}{1 - t} \qquad (j = 2, \dots, k). \tag{2.18} $$

Since the initial formula contains $M = \alpha N$ clauses of length $k$, the initial condition for this system
of equations is $c_j(0) = \alpha\,\delta_{j,k}$. Notice that, if at any time $c_1(t) > 0$, i.e. the formula contains an
extensive number of unit clauses, each of them has a probability of order $1/N$ of containing any given
variable, so that there is a finite probability that two unit clauses will contain the same variable. If
this happens, a contradiction is generated with finite probability at each step of the algorithm, so that
over a finite interval of time $\Delta t$ this will happen with probability 1. Therefore, if at any time during
the evolution of the formula $c_1(t)$ becomes positive, the algorithm will generate a contradiction and
will stop. This is the reason why the range of values of $j$ starts with 2. Since the rate at which unit
clauses are generated is

$$ \frac{2\,c_2(t)}{1 - t} \tag{2.19} $$

and the rate at which they are removed is at most 1 (because one variable is set at each time step,
and therefore at most one unit clause is removed), the condition for the onset of contradictions is

$$ \frac{2\,c_2(t)}{1 - t} = 1. \tag{2.20} $$
The system of equations (2.18) is easily solved:

$$ c_j(t) = \alpha \binom{k}{j} (1 - t)^j\, t^{k-j} \qquad (j = 2, \dots, k). \tag{2.21} $$

The algorithm will provide a solution with probability 1 if all the variables are set without generating
contradictions, i.e. if $2c_2(t)/(1 - t) < 1$ for all $t \in [0, 1]$. For $c_2(t)$ given by (2.21), this function reaches
a maximum for $t = t^* = (k - 2)/(k - 1)$, in which its value is

$$ \max_{t \in [0,1]} \frac{2\,c_2(t)}{1 - t} = \alpha k \left( \frac{k-2}{k-1} \right)^{k-2} \tag{2.22} $$

which is equal to 1 if

$$ \alpha = \alpha_h^{\mathrm{UC}}(k) \equiv \frac{1}{k} \left( \frac{k-1}{k-2} \right)^{k-2}. \tag{2.23} $$

Notice that this implies that for $\alpha < \alpha_h^{\mathrm{UC}}(k)$, random k-XORSAT formulae from the uniform distribution
are satisfiable with probability 1, and UC is capable of finding a satisfying assignment in linear time
with probability 1.
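
As a quick numerical check of this computation, the following Python sketch evaluates the closed-form densities, the threshold obtained by setting the maximum of $2c_2(t)/(1-t)$ equal to 1, and that maximum on a grid of values of $t$. The function names and the grid size are arbitrary choices made here for illustration.

```python
from math import comb

def c(j, t, alpha, k):
    """Clause density c_j(t) = alpha * C(k, j) * (1 - t)^j * t^(k - j)."""
    return alpha * comb(k, j) * (1 - t) ** j * t ** (k - j)

def uc_threshold(k):
    """Value of alpha at which max_t 2 c_2(t) / (1 - t) equals 1 (k >= 3)."""
    return ((k - 1) / (k - 2)) ** (k - 2) / k

def max_unit_clause_rate(alpha, k, grid=100_000):
    """Numerical maximum of 2 c_2(t) / (1 - t) over a grid of t in [0, 1)."""
    return max(2 * c(2, i / grid, alpha, k) / (1 - i / grid)
               for i in range(grid))
```

For $k = 3$ the formula gives a threshold of $2/3$, and `max_unit_clause_rate(uc_threshold(3), 3)` is equal to 1 up to floating point error, as expected.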

A similar analysis can be performed for GUC. Initially, the formula contains $M$ clauses of length
$k$. As variables are set, some clauses become shorter: let us suppose that after $T$ steps the number of
clauses of length $j$ is $C_j(T)$ for $j = j^*, \dots, k$ with $j^* > 1$, and 0 for $j < j^*$, and let us consider what
happens starting from there. When a variable is set by GUC, it is selected among the shortest clauses,
i.e. those of length $j^*$. A clause of length $j^* - 1$ is generated, and the other numbers of clauses vary
only if the same variable appears in other equations. That is to say, after the first variable has been
set the average variations in $C_j(T)$ are:

$$ \langle \Delta^{(1)} C_j(T) \rangle \equiv E[C_j(T+1) - C_j(T) \mid \{C_i(T)\}] = \frac{(j+1)\,C_{j+1}(T) - j\,C_j(T)}{N - T} \qquad (j = j^*+1, \dots, k), \tag{2.24} $$

$$ \langle \Delta^{(1)} C_{j^*}(T) \rangle \equiv E[C_{j^*}(T+1) - C_{j^*}(T) \mid \{C_i(T)\}] = -1 + \frac{(j^*+1)\,C_{j^*+1}(T) - j^*\,C_{j^*}(T)}{N - T}, \tag{2.25} $$

$$ \langle \Delta^{(1)} C_{j^*-1}(T) \rangle \equiv E[C_{j^*-1}(T+1) \mid \{C_i(T)\}] = 1 + \frac{j^*\,C_{j^*}(T)}{N - T}, \tag{2.26} $$

where the superscript $(n)$ indicates that $n$ variables have been set (here, $n = 1$).

Notice that the average number of clauses of length $j^* - 1$ is now of order $O(1)$, and not smaller
than 1. GUC will then select a variable from one of the clauses of length $j^* - 1$, giving:

$$ \langle \Delta^{(2)} C_j(T) \rangle = 2\, \frac{(j+1)\,C_{j+1}(T) - j\,C_j(T)}{N - T} + O(N^{-1}) \qquad (j = j^*+1, \dots, k), \tag{2.27} $$

$$ \langle \Delta^{(2)} C_{j^*}(T) \rangle = -1 + 2\, \frac{(j^*+1)\,C_{j^*+1}(T) - j^*\,C_{j^*}(T)}{N - T} + O(N^{-1}), \tag{2.28} $$

$$ \langle \Delta^{(2)} C_{j^*-1}(T) \rangle = 2\, \frac{j^*\,C_{j^*}(T)}{N - T} + O(N^{-1}), \tag{2.29} $$

$$ \langle \Delta^{(2)} C_{j^*-2}(T) \rangle = 1 + O(N^{-1}). \tag{2.30} $$

In these equations, the terms $O(N^{-1})$ come from the fact that we are considering the initial $T$ for
evaluating the functions, which results in a variation of $O(1)$ in the values of the $C_j$. Notice that UP
does not contribute to values of $j$ that are smaller than $j^* - 1$, because the number of clauses of such
lengths is not extensive.

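It will then take (on average) $j^* - 1$ steps (after the first one) to "empty" one of the clauses of
length $j^* - 1$ that have been generated:

$$ \langle \Delta^{(j^*)} C_j(T) \rangle = j^*\, \frac{(j+1)\,C_{j+1}(T) - j\,C_j(T)}{N - T} + O(N^{-1}) \qquad (j = j^*+1, \dots, k), \tag{2.31} $$

$$ \langle \Delta^{(j^*)} C_{j^*}(T) \rangle = -1 + j^*\, \frac{(j^*+1)\,C_{j^*+1}(T) - j^*\,C_{j^*}(T)}{N - T} + O(N^{-1}), \tag{2.32} $$

$$ \langle \Delta^{(j^*)} C_{j^*-1}(T) \rangle = j^*\, \frac{j^*\,C_{j^*}(T)}{N - T} + O(N^{-1}), \tag{2.33} $$

$$ \langle \Delta^{(j^*)} C_{j^*-2}(T) \rangle = O(N^{-1}). \tag{2.34} $$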

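Let us call a round the sequence of steps starting with the assignment of a variable in a clause of
length $j^* - 1$ and ending when there are no more clauses shorter than $j^* - 1$, such as the steps from
2 to $j^*$ in the previous argument. Each round has the same duration: $j^* - 1$ steps. During such a
round, the variation of the average number of clauses of length $j^* - 1$ is

$$ \langle \Delta^{(\mathrm{round})} C_{j^*-1}(T) \rangle = -1 + (j^* - 1)\, \frac{j^*\,C_{j^*}(T)}{N - T} + O(N^{-1}), \tag{2.35} $$

so that after $r \geq 1$ rounds the average variations will be

$$ \langle \Delta^{(1 + r(j^*-1))} C_j(T) \rangle = [1 + r(j^* - 1)]\, \frac{(j+1)\,C_{j+1}(T) - j\,C_j(T)}{N - T} + O(N^{-1}) \qquad (j = j^*+1, \dots, k), \tag{2.36} $$

$$ \langle \Delta^{(1 + r(j^*-1))} C_{j^*}(T) \rangle = -1 + [1 + r(j^* - 1)]\, \frac{(j^*+1)\,C_{j^*+1}(T) - j^*\,C_{j^*}(T)}{N - T} + O(N^{-1}), \tag{2.37} $$

$$ \langle \Delta^{(1 + r(j^*-1))} C_{j^*-1}(T) \rangle = 1 + \frac{j^*\,C_{j^*}(T)}{N - T} + r \left[ -1 + (j^* - 1)\, \frac{j^*\,C_{j^*}(T)}{N - T} \right] + O(N^{-1}). \tag{2.38} $$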

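There are two possible cases: either after a finite average number $R$ of rounds the average number
of clauses of length $j^* - 1$ returns to 0, or not. In the first case, $R$ is obtained from the condition:

$$ \langle \Delta^{(1 + R(j^*-1))} C_{j^*-1}(T) \rangle = 0 \tag{2.39} $$

$$ \Rightarrow \quad R = \frac{1 + \dfrac{j^*\,C_{j^*}(T)}{N - T}}{1 - (j^* - 1)\, \dfrac{j^*\,C_{j^*}(T)}{N - T}} + O(N^{-1}). \tag{2.40} $$

Notice that, since $R$ is an average number, it need not be an integer, and also that the condition for $R$
to be finite is

$$ (j^* - 1)\, \frac{j^*\,C_{j^*}(T)}{N - T} < 1. \tag{2.41} $$

After $R$ rounds, the number of steps that have been taken is

$$ \Delta T = 1 + R \times (j^* - 1) = \frac{j^*}{1 - (j^* - 1)\, \dfrac{j^*\,C_{j^*}(T)}{N - T}} + O(N^{-1}) \tag{2.42} $$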
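
The algebra behind the expression for the number of steps is short but easy to get wrong, so it may be worth spelling out. In the sketch below, $x$ is a shorthand introduced here only for convenience (it does not appear elsewhere in the text), and terms of order $O(N^{-1})$ are dropped:

```latex
x \equiv \frac{j^*\,C_{j^*}(T)}{N-T},
\qquad
1 + x + R\left[-1 + (j^*-1)\,x\right] = 0
\;\Longrightarrow\;
R = \frac{1+x}{1-(j^*-1)\,x},
\\[4pt]
\Delta T = 1 + R\,(j^*-1)
= \frac{1-(j^*-1)\,x + (j^*-1)(1+x)}{1-(j^*-1)\,x}
= \frac{j^*}{1-(j^*-1)\,x}.
```

In particular $\Delta T$ diverges exactly when $(j^*-1)\,x \to 1$, which is the same condition under which $R$ ceases to be finite.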

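and the total average variations will be:

$$ \langle \Delta^{(\Delta T)} C_j(T) \rangle = \Delta T\, \frac{(j+1)\,C_{j+1}(T) - j\,C_j(T)}{N - T} + O(N^{-1}) \qquad (j = j^*+1, \dots, k), \tag{2.43} $$

$$ \langle \Delta^{(\Delta T)} C_{j^*}(T) \rangle = -1 + \Delta T\, \frac{(j^*+1)\,C_{j^*+1}(T) - j^*\,C_{j^*}(T)}{N - T} + O(N^{-1}), \tag{2.44} $$

$$ \langle \Delta^{(\Delta T)} C_{j^*-1}(T) \rangle = O(N^{-1}). \tag{2.45} $$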
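Wormald's theorem can be applied, ensuring that in the thermodynamic limit the contributions of
order $O(N^{-1})$ are negligible, and that the average densities are concentrated around the functions
$c_j(t)$ that are solutions of the differential equations obtained by dividing equations (2.43) to (2.45) by
$\Delta T$, given by (2.42). The equations we obtain are the following:

$$ \frac{dc_j}{dt} = \frac{(j+1)\,c_{j+1} - j\,c_j}{1 - t} \qquad (j = j^*+1, \dots, k), \tag{2.46} $$

$$ \frac{dc_{j^*}}{dt} = \frac{(j^*+1)\,c_{j^*+1} - j^*\,c_{j^*}}{1 - t} - \left[ \frac{1}{j^*} - \frac{(j^* - 1)\,c_{j^*}}{1 - t} \right], \tag{2.47} $$

which we can rewrite as a single equation

$$ \frac{dc_j}{dt} = \frac{(j+1)\,c_{j+1} - j\,c_j}{1 - t} - \delta_{j,j^*} \left[ \frac{1}{j} - \frac{(j - 1)\,c_j}{1 - t} \right] \qquad (j = j^*, \dots, k). \tag{2.48} $$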

We still have to analyze what happens when $R$ diverges. In that case, the rate at which clauses
of length $j^* - 1$ accumulate is larger than the rate at which they can be removed, and their number
becomes extensive. This signals that the value of $j^*$ must decrease by one unit.

In Paragraph 4.3.3 I shall give a detailed study of the solution to these equations for $k = 3$,
showing that GUC finds solutions in linear time with probability 1 for random formulae from the
uniform distribution for $\alpha < \alpha_h^{\mathrm{GUC}}(3) \simeq 0.750874$, which is therefore a lower bound for the value up
to which random formulae are SAT with probability 1.
Chapter 3

Phase transitions in random optimization problems



In the previous Chapter I have introduced two random optimization problems, k-SAT and k-XORSAT,
which are equivalent to some spin glass models. In this chapter I am going to review the rich
phenomenology displayed by these models, consisting of several phase transitions regarding different order
parameters.

I shall first make a very brief introduction to the discovery of sharp transitions in numerical
experiments, mostly concerning k-SAT, in Section 3.1; then, I shall give a rigorous derivation of the
phase diagram of k-XORSAT in Section 3.2; finally, in Section 3.3 I shall sketch the main results on the
phase diagram of k-SAT.

3.1 Evidence of phase transitions from numerical experiments 

Phase transitions are a common and well understood concept in statistical mechanics. In the context 
of random combinatorial optimization problems, it is far less obvious what this can mean. I shall 
therefore start with a definition and a simple example. 

Let us consider a random problem defined over some distribution of instances, and a property $\mathcal{P}$
which might be true or false for each instance. I shall denote by $N$ the size of the problem, by $c$ some
control parameter, and by $P(N, c)$ the probability over the distribution of instances that $\mathcal{P}$ is true.
Then, a sharp transition in $\mathcal{P}$ is defined by the following condition:

$$ \lim_{N \to \infty} P(N, c) = \begin{cases} 0 & \text{if } c < c^* \\ 1 & \text{if } c > c^* \end{cases} \tag{3.1} $$

where $c^*$ is a constant threshold independent of $N$.

For example, we might consider random graphs with $N$ vertices and $M = cN$ edges, and ask what
is the probability $P(N, c)$ that the largest connected component in the graph has size $\gamma N$ with $\gamma > 0$
and independent of $N$. This problem, called random graph percolation, has been studied by Erdős
and Rényi in [49, 50]. They have proved that percolation indeed undergoes a sharp transition,
with threshold value $c^* = 1/2$.
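
The percolation transition is easy to observe numerically. The following Python sketch measures the fraction of vertices in the largest connected component with a union-find structure; the function name is hypothetical, and for simplicity the $cN$ edges are drawn independently with replacement (so that self-loops and repeated edges may occur), which is an assumption of this sketch and is not expected to matter for large $N$.

```python
import random

def largest_component_fraction(n, c, rng=random):
    """Fraction of vertices in the largest connected component of a random
    graph with n vertices and int(c * n) uniformly drawn edges."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for _ in range(int(c * n)):
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a != b:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a                   # union by size
            size[a] += size[b]
    return max(size[find(v)] for v in range(n)) / n
```

For $c$ well below $1/2$ the returned fraction is close to zero for large $n$, while for $c$ well above $1/2$ it is a finite number, in agreement with the sharp transition described above.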

In numerical studies the definition (3.1) is of little use, as the size of samples has to be finite. Some
method to extrapolate results to the $N \to \infty$ limit is needed. For large but finite $N$, $P(N, c)$ will be
a smooth function of $c$ varying from 0 to 1, whose form will in general depend on $N$. The transition
region, defined as the range of values of $c$ in which $\epsilon < P(N, c) < 1 - \epsilon$ for some finite $\epsilon$ independent
of $N$, will have a width $\Delta(N)$ which will become smaller and smaller as $N$ grows. If $\Delta(N)$ scales as
a power of $N$, $\Delta(N) \sim N^{-\nu}$ for some constant $\nu$, one can rescale

$$ P(N, c) = \phi_N\big( (c - c^*)\, N^{\nu} \big) \tag{3.2} $$

and hope that the function $\phi_N(\cdot)$ becomes independent of $N$ for large (but experimentally accessible)
$N$. If this is the case, the values of $\nu$ and $c^*$ can be obtained by fitting numerical data so that they
"collapse" on $\phi(\cdot)$. This is one of the simplest applications of a general method which goes under the
name of finite size scaling (see for example [51]). In the case of percolation on random graphs, a finite
size scaling of the type of (3.2) holds, with $\nu = 1/3$.

Finite size scaling was applied in [52] to k-SAT, providing the first numerical evidence for a sharp
transition between a SAT phase where random formulae are satisfiable with probability 1 and an UNSAT
phase where they are unsatisfiable with probability 1. The threshold value $\alpha_s(k)$ was measured for
$k = 2, 3, 4, 5, 6$, together with the exponent $\nu(k)$. For example, for $k = 3$ the values found were
$\alpha_s(3) \simeq 4.17$ and $\nu(3) \simeq 0.67$. However, due to the relatively small size of the formulae considered
($N \approx 100$), these values were later proved to be inaccurate (most notably the exponents).

Previous studies, for example [53], had measured the probability of a random formula being
satisfiable, pointing out that it was 1/2 for $\alpha \simeq 4.25$ for $k = 3$ and $N$ sufficiently large, but without
discussing the $N$ dependence of the transition width. In fact, the main purpose of that study was to
analyze a different phenomenon: the variation of the running times of the complete DPLL procedure
on random formulae as a function of $\alpha$. What the authors had noticed, and what motivated their work,
was that formulae were "hardest" to solve in a region centered on the value of $\alpha$ corresponding to
$P[\mathrm{SAT} \mid N, \alpha] = 1/2$.

This problem was analyzed again in [54], in which finite size scaling techniques were applied to
the median running time as a function of $N$ and $\alpha$. Even though the maximum of the running time is
reached for $\alpha \simeq \alpha_s(k)$ for large $N$, this is a very different phenomenon from the SAT/UNSAT transition,
since it is related to the dynamical properties of an algorithm (while the SAT/UNSAT transition is a
property of the ensemble of formulae themselves).

These two problems, the phase transitions of random constraint satisfaction problems, and the 
dependency on a of the performance of algorithms, as well as their connection to the properties of 
typical random formulae and of their solutions, will be the main topic of the rest of this Chapter, in 
which I shall present some well known results, and of the second Part of this thesis, presenting some 
original ones. 

3.2 Rigorous derivation of the phase diagram of k-XORSAT

In this section I shall present some rigorous results on the phase diagram of k-XORSAT. The cases
$k = 1$ and $k = 2$ are much simpler than the general case $k \geq 3$. On the other hand, all values of $k \geq 3$
give rise to the same behavior (at least qualitatively), while the behavior for $k = 1$ and 2 is different.
For these reasons I shall restrict to $k \geq 3$ in this Chapter.

As in the case of k-SAT, it is intuitive to expect that as the ratio $\alpha = M/N$ between the number
of clauses $M$ and the number of variables $N$ increases, the probability that a random formula be
satisfiable will decrease. And numerical experiments confirm that (as was the case for k-SAT) the
transition between the SAT and the UNSAT phases becomes sharp as $N \to \infty$.

However we shall see that the phase diagram of k-XORSAT presents a richer structure than just
a SAT/UNSAT transition, and that the geometrical properties of the set of solutions in the SAT phase
present a second phase transition, which can be related to the performance of search algorithms, as I
shall discuss in Chapter 4.

3.2.1 Bounds from first and second moments 

In this paragraph I shall derive a rigorous bound for the threshold value $\alpha_s(k)$ of the SAT/UNSAT
transition, first proved in [55].

The number of solutions $\mathcal{N}$ of k-XORSAT formulae with fixed $M$ and $N$ can be regarded as a random
variable whose distribution $\mathcal{P}(\mathcal{N})$ will depend on the distribution of the formulae considered. Since
this random variable only takes non-negative integer values, the following inequality must hold:

$$ \langle \mathcal{N} \rangle = \sum_{\mathcal{N}=0}^{\infty} \mathcal{N}\, \mathcal{P}(\mathcal{N}) \geq \sum_{\mathcal{N}=1}^{\infty} \mathcal{P}(\mathcal{N}) = P[\mathrm{SAT}] \tag{3.3} $$

which means that the probability of having at least one solution is smaller than or equal to the average
number of solutions. This bound for the probability that a formula is satisfiable is called first moment
inequality.

Let us denote by $X = \{x_i \,|\, i = 1, \dots, N\}$ a configuration of $N$ boolean variables. In order to
compute the average number of solutions of a random formula drawn from the uniform distribution,
we introduce the indicator function $\epsilon_\ell(X)$ which is equal to 1 if the configuration $X$ verifies clause $\ell$
and 0 otherwise. Then:

$$ \langle \mathcal{N} \rangle = \Big\langle \sum_X \prod_{\ell=1}^{M} \epsilon_\ell(X) \Big\rangle . \tag{3.4} $$

Since the clauses are extracted independently of one another, the average over the choices of the
formula can be computed as an average over the choices of each clause appearing in it:

$$ \langle \mathcal{N} \rangle = \sum_X \prod_{\ell=1}^{M} \langle \epsilon_\ell(X) \rangle . \tag{3.5} $$

Moreover, the probability that any configuration X satisfies a uniformly drawn random clause is 1/2, 
since for any choice of the indices appearing in the clause (and therefore, for fixed X, for any left hand 
side of the clause), the two choices true and false for the right hand side have equal probability. 
We obtain the very simple result: 

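$$ \langle \mathcal{N} \rangle = 2^N \times 2^{-M} = 2^{N(1-\alpha)} \tag{3.6} $$

and therefore from the first moment inequality:

$$ P[\mathrm{SAT}] \leq \langle \mathcal{N} \rangle = 2^{N(1-\alpha)} \tag{3.7} $$

which goes to zero for $N \to \infty$ if $\alpha > 1$.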
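The result (3.6) can be checked exactly on a small instance. The sketch below is an illustration that is not part of the original derivation: it fixes the left hand sides of the clauses, enumerates all $2^M$ right hand sides and counts solutions by brute force. Since every configuration satisfies exactly one choice of right hand sides, the average is $2^{N-M}$ for any choice of the clause indices. The function names and the instance sizes are assumptions made for the example.

```python
from itertools import product
import random


def count_solutions(clauses, rhs, n_vars):
    """Brute-force count of assignments satisfying every XOR clause."""
    count = 0
    for x in product((0, 1), repeat=n_vars):
        if all(sum(x[i] for i in clause) % 2 == b
               for clause, b in zip(clauses, rhs)):
            count += 1
    return count


def average_over_rhs(clauses, n_vars):
    """Average the number of solutions over all 2^M right hand sides."""
    m = len(clauses)
    total = sum(count_solutions(clauses, rhs, n_vars)
                for rhs in product((0, 1), repeat=m))
    return total / 2 ** m


rng = random.Random(0)                    # fixed seed, for reproducibility
n_vars, n_clauses, k = 8, 5, 3            # illustrative sizes
clauses = [tuple(rng.sample(range(n_vars), k)) for _ in range(n_clauses)]
mean_solutions = average_over_rhs(clauses, n_vars)
print(mean_solutions, 2 ** (n_vars - n_clauses))
```

The two printed numbers coincide whatever the random seed, because the identity holds clause set by clause set and not only on average over the indices.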

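A lower bound for $P[\mathrm{SAT}]$ can be obtained from the second moment inequality, which is derived
from the Cauchy-Schwarz inequality for the scalar product

$$ u \cdot v = \sum_{\mathcal{N}} \mathcal{P}(\mathcal{N})\, u_{\mathcal{N}} v_{\mathcal{N}} , \tag{3.8} $$

which ensures that

$$ (u \cdot v)^2 \leq (u \cdot u) \times (v \cdot v) \tag{3.9} $$

for any vectors $u$ and $v$. In particular, by choosing $u_{\mathcal{N}} = \mathcal{N}$ for any $\mathcal{N}$, and $v_{\mathcal{N}} = 1$ for $\mathcal{N} \geq 1$ and
$v_0 = 0$, one obtains:

$$ \langle \mathcal{N} \rangle^2 = \Big( \sum_{\mathcal{N} \geq 1} \mathcal{N}\, \mathcal{P}(\mathcal{N}) \Big)^2 \leq \Big( \sum_{\mathcal{N} \geq 0} \mathcal{N}^2\, \mathcal{P}(\mathcal{N}) \Big) \times \Big( \sum_{\mathcal{N} \geq 1} \mathcal{P}(\mathcal{N}) \Big) = \langle \mathcal{N}^2 \rangle \times P[\mathrm{SAT}] . \tag{3.10} $$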



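The crucial point is to compute

$$ \langle \mathcal{N}^2 \rangle = \Big\langle \Big( \sum_X \prod_{\ell=1}^{M} \epsilon_\ell(X) \Big)^2 \Big\rangle = \Big\langle \sum_{X,Y} \prod_{\ell=1}^{M} \epsilon_\ell(X)\, \epsilon_\ell(Y) \Big\rangle = \sum_{X,Y} \langle \epsilon(X)\, \epsilon(Y) \rangle^M \tag{3.11} $$

where again we made use of the independence of clauses in the extraction of a random formula to
write the result in terms of $\langle \epsilon(X)\, \epsilon(Y) \rangle$, which is the probability that both $X$ and $Y$ satisfy a random
clause. This quantity will obviously depend on how different $X$ and $Y$ are: if $X$ satisfies the clause,
$Y$ will also satisfy it if and only if the number of variables appearing in the clause that are different
in $X$ and $Y$ is even. When averaging over the choice of the clause, this will depend on the (intensive) Hamming
distance $d(X, Y)$ between $X$ and $Y$,

$$ d(X, Y) = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}[x_i \neq y_i] . \tag{3.12} $$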



For example, for $k = 3$ the probability that two configurations at distance $d$ satisfy a random
clause is

$$ p_3(d) = \frac{1}{2} \left[ (1-d)^3 + 3 d^2 (1-d) \right] + O(N^{-1}) \tag{3.13} $$

where the factor 1/2 comes from the probability that $X$ satisfies the clause to begin with; the term
$(1-d)^3$ is the probability that the 3 variables appearing in the clause take the same value in $X$ and $Y$;
the term $3d^2(1-d)$ is the probability that two variables are different and one is equal (among those
appearing in the clause) in $X$ and $Y$; and finally, we are neglecting a term of order $N^{-1}$ arising from
the correlations in the choices of the variables appearing in a single clause (which must be different).
The general form will be

$$ p_k(d) = \frac{1}{2} \sum_{i = 0, 2, \dots}^{k} \binom{k}{i} d^i (1-d)^{k-i} + O(N^{-1}) \tag{3.14} $$



in which only the even terms in the binomial expansion are taken. Notice that (contrary to the upper 
bound obtained from the first moment inequality), the lower bound derived from the second moment 
inequality will therefore depend on k. 
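As a side remark that is not part of the original text, the sum over even terms in (3.14) has a simple closed form, obtained by adding the binomial expansions of $[(1-d)+d]^k$ and $[(1-d)-d]^k$, in which the odd terms cancel:

```latex
\begin{align}
\sum_{i\ \mathrm{even}} \binom{k}{i} d^i (1-d)^{k-i}
  &= \frac{1}{2} \Big\{ [(1-d)+d]^k + [(1-d)-d]^k \Big\}
   = \frac{1 + (1-2d)^k}{2} , \\
p_k(d) &= \frac{1 + (1-2d)^k}{4} + O(N^{-1}) .
\end{align}
```

For $k = 3$ this gives $[1 + (1-2d)^3]/4 = (1 - 3d + 6d^2 - 4d^3)/2$, which agrees with (3.13), and for any $k$ it gives $p_k(1/2) = 1/4$, the value expected for two uncorrelated configurations.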
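Going back to (3.11) we obtain:

$$ \langle \mathcal{N}^2 \rangle = \sum_{X,Y} p_k(d(X,Y))^M = \sum_{d = 0, 1/N, 2/N, \dots} p_k(d)^M\, \mathcal{M}(d) \tag{3.15} $$

where $\mathcal{M}(d)$ is the number of pairs of configurations at distance $d$, i.e.

$$ \mathcal{M}(d) = 2^N \binom{N}{Nd} \tag{3.16} $$

so that for large $N$

$$ \langle \mathcal{N}^2 \rangle \simeq \sum_{d = 0, 1/N, 2/N, \dots} \exp\left\{ N \log 2 \left[ 1 - (1-d) \log_2 (1-d) - d \log_2 d + \alpha \log_2 p_k(d) \right] \right\} \tag{3.17} $$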
which we evaluate with the Laplace method:

$$ \langle \mathcal{N}^2 \rangle = 2^{N \gamma_k(\alpha, d^*)} \tag{3.18} $$

where $\gamma_k(\alpha, d)$ is the function multiplying $N \log 2$ in (3.17) and $d^*$ is the value of $d$ that maximizes it
in the interval [0, 1]. The result of the second moment calculation is:

$$ P[\mathrm{SAT}] \geq \frac{\langle \mathcal{N} \rangle^2}{\langle \mathcal{N}^2 \rangle} = 2^{N [ 2(1-\alpha) - \gamma_k(\alpha, d^*) ]} . \tag{3.19} $$

For $k = 3$ one obtains that if $\alpha < \alpha_0(3) \simeq 0.889$ the function $\gamma_3(\alpha, d)$ has a global maximum in
$d = 1/2$ where $\gamma_3(\alpha, 1/2) = 2(1-\alpha) + o(1)$ (the asymptotics are for $N \to \infty$); if $\alpha > \alpha_0(3)$ a maximum
located at $d < 1/2$ becomes larger than the local one at $d = 1/2$ and $\gamma_3(\alpha, d^*) > 2(1-\alpha) + o(1)$.
Comparing with (3.19) one sees that, in the limit $N \to \infty$, $P[\mathrm{SAT}] > 0$ if $\alpha < \alpha_0(3)$.
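The competition between the two maxima can be explored numerically. The following sketch is an illustration and not the computation of the original reference: it evaluates the exponent of (3.17) on a grid of distances and locates by bisection the value of $\alpha$ at which a maximum at $d < 1/2$ overtakes the one at $d = 1/2$. The function names, the grid size and the tolerances are assumptions made for the example; the $O(N^{-1})$ corrections are neglected.

```python
import math


def pair_probability(d, k):
    """p_k(d) from (3.14): even terms of the binomial expansion, times 1/2."""
    return 0.5 * sum(math.comb(k, i) * d**i * (1 - d)**(k - i)
                     for i in range(0, k + 1, 2))


def binary_entropy(d):
    """-d log2 d - (1-d) log2 (1-d), with the convention 0 log 0 = 0."""
    return -sum(x * math.log2(x) for x in (d, 1 - d) if x > 0)


def gamma(alpha, d, k):
    """Exponent gamma_k(alpha, d) multiplying N log 2 in (3.17)."""
    return 1 + binary_entropy(d) + alpha * math.log2(pair_probability(d, k))


def excess(alpha, k, grid=2000):
    """max_d gamma_k(alpha, d) - 2(1 - alpha), scanning d in (0, 1/2].

    Distances above 1/2 are not scanned: there p_k(d) <= 1/4 for odd k,
    and the function is symmetric around 1/2 for even k.
    """
    best = max(gamma(alpha, i / (2 * grid), k) for i in range(1, grid + 1))
    return best - 2 * (1 - alpha)


def alpha_0(k, tol=1e-6):
    """Bisection for the largest alpha at which d = 1/2 is the global maximum."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid, k) > 1e-9:     # a maximum at d < 1/2 dominates
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


print(round(alpha_0(3), 3))
```

The bisection relies on the fact that, for $d < 1/2$, the difference $\gamma_k(\alpha, d) - 2(1-\alpha)$ is increasing in $\alpha$, so that the condition tested is monotone.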

The same analysis can be performed for larger values of $k$, leading to similar results. In fact, one
can prove a stronger statement: not only $P[\mathrm{SAT}] > 0$ if $\alpha < \alpha_0(k)$, but the lower bound is equal to 1 in
the thermodynamic limit, so that random $k$-XORSAT formulae are SAT with probability 1 if $\alpha < \alpha_0(k)$.

The conclusion of the first and second moment calculations is that, if there is a sharp transition
between the SAT and the UNSAT phases in $k$-XORSAT, it must occur for $\alpha = \alpha_s(k)$ such that

$$ \alpha_0(k) \leq \alpha_s(k) \leq 1 . \tag{3.20} $$

Since these bounds are not tight, one cannot conclude whether such a sharp transition exists on the
basis of the first and second moment inequalities.



3.2.2 Leaf removal procedure 

The leaf removal procedure makes it possible to prove that a sharp transition between the SAT and UNSAT phases
indeed exists, to compute the value of $\alpha_s(k)$ at which it occurs, and to characterize the geometry of
the solutions [56, 57].

The main idea behind this powerful argument is the following: if the formula contains a variable $x_i$
which has a unique occurrence, the value of $x_i$ is constrained only by the clause in which it appears.
Given the values of the other variables that appear in it, one is free to set the value of $x_i$ so as to
satisfy the clause. This means that a clause which contains a single-occurrence variable does not
constrain the values of the other variables that appear in it. One can then set it apart, and look
for a solution of the reduced formula in which neither the single-occurrence variable nor the clause it
belongs to are present. Moreover, when a clause is set apart, it is possible that some variable that
appears in it becomes a single-occurrence variable (relative to the rest of the formula), so that the
removal of single-occurrence variables (leaves) is an iterative procedure. In the following I shall give
a quantitative description of this process.
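The iterative procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the cited works: clauses are given as tuples of distinct variable indices, the right hand sides are ignored because they play no role in the removal, and the names `leaf_removal`, `removal_order` and `core` are hypothetical.

```python
from collections import defaultdict


def leaf_removal(clauses):
    """Iteratively remove clauses containing a single-occurrence variable.

    Returns (removal_order, core): `removal_order` lists the pairs
    (clause_index, leaf_variable) in the order of removal, and `core`
    is the sorted list of clause indices that cannot be removed.
    """
    occurrences = defaultdict(set)          # variable -> clauses containing it
    for c, clause in enumerate(clauses):
        for v in clause:
            occurrences[v].add(c)

    leaves = [v for v, cs in occurrences.items() if len(cs) == 1]
    alive = set(range(len(clauses)))
    removal_order = []

    while leaves:
        v = leaves.pop()
        if len(occurrences[v]) != 1:        # stale entry: its clause is gone
            continue
        (c,) = occurrences[v]
        alive.discard(c)
        removal_order.append((c, v))
        for u in clauses[c]:
            occurrences[u].discard(c)
            if len(occurrences[u]) == 1:    # u has just become a leaf
                leaves.append(u)

    return removal_order, sorted(alive)


# A chain of two clauses is removed completely ...
print(leaf_removal([(0, 1, 2), (2, 3, 4)])[1])
# ... while four clauses on four variables, each appearing three times, are not.
print(leaf_removal([(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)])[1])
```

Reading `removal_order` backwards gives the order in which clauses can be reinserted, each one bringing at least one variable that is free at that point.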

Let us consider a random $k$-XORSAT formula with $M$ clauses and $N$ variables. It is easy to show
that the distribution of the number of occurrences $\ell$ of the variables in the formula will be a poissonian
with parameter $\alpha k$:

$$ P(\ell) = e^{-\alpha k}\, \frac{(\alpha k)^\ell}{\ell!} . \tag{3.21} $$

A finite fraction $\alpha k\, e^{-\alpha k}$ of variables will therefore have a single occurrence. Let us assume that we
proceed by removing them one at a time, in successive "steps".

I shall denote by $n_\ell(T)$ the average number (divided by $N$) of variables that have $\ell$ occurrences
after $T$ steps. At that point, the total number of variables in the system is $N' = N - T$, and the
total number of clauses is $M' = M - T$, since at each step one variable and one clause are removed
from the system. During a step, the number of occurrences of the other variables that appear in the
removed clause will also be decreased by one. What is the probability that one of these other variables
has $\ell$ occurrences? One might be tempted to say that it is just proportional to $n_\ell$, since that is the
probability that a variable has $\ell$ occurrences. However this is wrong, for the following reason. We
can regard the formula as a table with $M'$ rows and $k$ columns, where the "slot" in row $i$ and column
$j$ contains the index of the $j$-th variable in the $i$-th clause. A variable which has $\ell$ occurrences in the
formula will appear in $\ell$ slots of the table. So the number of slots in the table that contain variables
that have $\ell$ occurrences is $\ell \times N \times n_\ell$, and the probability that a randomly chosen slot contains a
variable with $\ell$ occurrences is $\ell n_\ell / \sum_{\ell'} \ell' n_{\ell'}$. Since the number of variables in the removed clause is
$k$, the average number of variables that appear in it (apart from the single-occurrence variable that
we have chosen to eliminate) and that have $\ell$ occurrences is therefore $(k-1)\, \ell n_\ell / \sum_{\ell'} \ell' n_{\ell'}$.

We can use Wormald's theorem, which I introduced in a previous Chapter, to write a differential equation
for $n_\ell(t)$, where $t = T/N$, in the limit $N \to \infty$:

$$ \frac{dn_\ell}{dt} = (k-1)\, \frac{(\ell+1)\, n_{\ell+1}(t) - \ell\, n_\ell(t)}{k(\alpha - t)} \tag{3.22} $$

where $k(\alpha - t) = \sum_\ell \ell\, n_\ell(t)$ is the total number of slots divided by $N$ (remember that exactly $k$ slots
are removed at each step). The first term corresponds to the variables that have $\ell + 1$ occurrences
before the clause is removed, which afterwards have $\ell$ occurrences, while the second term corresponds
to the variables that have $\ell$ occurrences before the clause is removed and which afterwards have $\ell - 1$
occurrences. It is easy to check that this equation can be extended to $\ell = 0$ and $\ell = 1$ as follows:

$$ \frac{dn_\ell}{dt} = (k-1)\, \frac{(\ell+1)\, n_{\ell+1}(t) - \ell\, n_\ell(t)}{k(\alpha - t)} + \delta_{\ell,0} - \delta_{\ell,1} . \tag{3.23} $$

The initial condition that must be imposed is (3.21):

$$ n_\ell(0) = e^{-\alpha k}\, \frac{(\alpha k)^\ell}{\ell!} . \tag{3.24} $$

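It is easy to see that, for $\ell \geq 2$, $n_\ell$ remains poissonian even for $t > 0$, with some time dependent
parameter which is $\lambda(t)$. To prove it, one just needs to substitute the ansatz

$$ n_\ell(t) = e^{-\lambda(t)}\, \frac{\lambda(t)^\ell}{\ell!} \tag{3.25} $$

into (3.22) to obtain

$$ \frac{dn_\ell}{dt} = -\frac{d\lambda}{dt} \left[ n_\ell(t) - n_{\ell-1}(t) \right] = \frac{k-1}{k(\alpha - t)}\, \lambda(t) \left[ n_\ell(t) - n_{\ell-1}(t) \right] \tag{3.26} $$

from which one obtains an equation for $\lambda(t)$ independent of $\ell$:

$$ \frac{d\lambda}{dt} = -\frac{k-1}{k(\alpha - t)}\, \lambda(t) . \tag{3.27} $$

Solving it with the initial condition $\lambda(0) = \alpha k$ gives

$$ \lambda(t) = \alpha k \left( 1 - \frac{t}{\alpha} \right)^{(k-1)/k} . \tag{3.28} $$

(Footnote to the differential equation (3.22): a detailed derivation is provided in Section 4.1 for a more general case.)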
However, $n_1(t)$ is not poissonian, because of the extra $\delta_{\ell,1}$ term in (3.23) compared to (3.22), and
to compute it we use the following trick:

$$ n_1(t) = \sum_{\ell=1}^{\infty} \ell\, n_\ell(t) - \sum_{\ell=2}^{\infty} \ell\, n_\ell(t) = k(\alpha - t) - \sum_{\ell=2}^{\infty} \ell\, e^{-\lambda(t)}\, \frac{\lambda(t)^\ell}{\ell!} = k(\alpha - t) - \left[ \lambda(t) - \lambda(t)\, e^{-\lambda(t)} \right] \tag{3.29} $$

which can be conveniently expressed in terms of the parameter $b = (1 - t/\alpha)^{1/k}$:

$$ n_1(b) = \lambda(b) \left[ b + e^{-\lambda(b)} - 1 \right] \tag{3.30} $$

with $\lambda(b) = \alpha k\, b^{k-1}$. The interval of variation of $t$ is $[0, \alpha]$ (since after $\alpha N = M$ steps all the clauses
are eliminated from the system), and correspondingly $b$ varies between the initial value 1 and 0.
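The closed forms (3.28) and (3.29) can be cross-checked against a direct numerical integration of (3.23). The sketch below is only an illustration under stated assumptions: the occurrence distribution is truncated at a maximum number of occurrences `L`, the integration uses a standard fourth-order Runge-Kutta scheme, and the values $\alpha = 0.7$, $k = 3$ and the final time are arbitrary choices for which $n_1$ stays positive.

```python
import math


def rhs(t, n, alpha, k):
    """Right hand side of (3.23) for the truncated vector n[0..L]."""
    L = len(n) - 1
    rate = (k - 1) / (k * (alpha - t))
    out = []
    for l in range(L + 1):
        upper = (l + 1) * n[l + 1] if l < L else 0.0   # truncation: n[L+1] = 0
        out.append(rate * (upper - l * n[l]))
    out[0] += 1.0   # + delta_{l,0}
    out[1] -= 1.0   # - delta_{l,1}
    return out


def integrate(alpha, k, t_end, steps=2000, L=60):
    """Runge-Kutta integration of (3.23) starting from the poissonian (3.24)."""
    lam0 = alpha * k
    n = [math.exp(-lam0) * lam0**l / math.factorial(l) for l in range(L + 1)]
    h, t = t_end / steps, 0.0
    for _ in range(steps):
        k1 = rhs(t, n, alpha, k)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(n, k1)], alpha, k)
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(n, k2)], alpha, k)
        k4 = rhs(t + h, [a + h * b for a, b in zip(n, k3)], alpha, k)
        n = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(n, k1, k2, k3, k4)]
        t += h
    return n


def n1_closed_form(t, alpha, k):
    """n_1(t) from (3.29), with lambda(t) taken from (3.28)."""
    lam = alpha * k * (1 - t / alpha) ** ((k - 1) / k)
    return k * (alpha - t) - lam + lam * math.exp(-lam)


alpha, k, t_end = 0.7, 3, 0.3
n = integrate(alpha, k, t_end)
print(n[1], n1_closed_form(t_end, alpha, k))
```

The same run can be used to check that the components with $\ell \geq 2$ stay poissonian with parameter $\lambda(t)$, and that $\sum_\ell \ell\, n_\ell(t) = k(\alpha - t)$ is preserved by the dynamics.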

There are now two possibilities, depending on the value of $\alpha$: either $n_1(b) > 0$ for all $b \in (0, 1]$, or
for some value $b^* \in (0, 1]$ one has $n_1(b^*) = 0$. In the first case the algorithm stops when all the clauses
have been removed from the system. In the second case, one is left with an irreducible sub-formula
containing $N(\alpha - t^*) = N \alpha (b^*)^k$ clauses and $N \sum_{\ell \geq 2} n_\ell(b^*) = N - N(1-b^*)[1 + \alpha k (b^*)^{k-1}]$ variables.
Note that the sub-formula is still uniformly random, conditioned on the distribution $n_\ell(b^*)$.

It is easy to check that the first case occurs for $\alpha < \alpha_c(k)$ where $\alpha_c(k)$ is a constant, while for
$\alpha > \alpha_c(k)$ the value of $n_1$ vanishes for $b^* > 0$, which is the largest solution of $n_1(b) = 0$ in (3.30). I shall denote by
$b_c(k)$ the value of $b^*$ corresponding to $\alpha = \alpha_c(k)$. Numerical values of these constants (and their
asymptotics for $k \to \infty$) are shown in Table 3.1 in the following paragraph.

Let us now turn to the implications of these results on the original formula. If the first case occurs
(i.e. if $\alpha < \alpha_c(k)$), one can "invert" the procedure, and reinsert the clauses into the formula one at
a time, in the reverse order with which they were removed. When the first clause is reinserted, one
can choose freely the values of $k - 1$ variables, and set the last variable to the value which satisfies the
clause. In general, when one reinserts a clause containing $j$ "new" variables, the value of $j - 1$ of them
is set arbitrarily, and the last one is set to the value which satisfies the clause. Notice that since each
removed clause contained a variable which had a single occurrence at the time when it was removed,
each reinserted clause will contain at least one new variable. One can then obtain a solution to the
original formula in this manner.

What is the number of solutions that one obtains? Not counting the variable which has been
selected for removal, the average number of single-occurrence (i.e. "new") variables present in the
clause removed at time $t$ is $(k-1)\, n_1(t) / [k(\alpha - t)]$. For each of them two values can be chosen. The
number of solutions $\mathcal{N}$ is therefore

$$ \mathcal{N} = 2^{N s} , \qquad s = \int_0^{t^*} \frac{(k-1)\, n_1(t)}{k(\alpha - t)}\, dt + e^{-\alpha k} \tag{3.31} $$

where the last term comes from the variables which do not appear in the system. The integral is
easily done recalling that for $\alpha < \alpha_c(k)$ one has $t^* = \alpha$ and substituting (3.28) and (3.29) to obtain
$s = 1 - \alpha$, as expected from (3.6).

In the second case, for $\alpha_c(k) < \alpha$, the leaf removal procedure ends with a sub-formula with a
clause to variables ratio $\alpha'$ given by

$$ \alpha' = \frac{\alpha (b^*)^k}{1 - (1 - b^*) \left[ 1 + \alpha k (b^*)^{k-1} \right]} \tag{3.32} $$

which is an increasing function of $\alpha$. The original formula is SAT if and only if the sub-formula is also
SAT, and we would like to know if this is the case, depending on the value of $\alpha'$. As we have seen in
the calculation of the bound from the first and second moments (3.20), the upper bound $\alpha_s(k) \leq 1$ is
independent of the distribution of random instances, while the lower bound $\alpha_0(k) \leq \alpha_s(k)$ depends
on it. The computation of the lower bound must therefore be adapted to a distribution of instances
which is uniform conditioned on the average numbers of occurrences $\{n_\ell(b^*)\}$, which is 0 for $\ell = 1$
and poissonian with parameter $\lambda(b^*)$ for $\ell \geq 2$. This is done in a detailed manner in [57]. The
result is remarkable: in the absence of single-occurrence variables, the number of solutions
becomes a concentrated quantity and $\langle \mathcal{N}^2 \rangle = \langle \mathcal{N} \rangle^2$, so that the bounds from first and second moment
inequalities become tight: $\alpha_0(k) = 1$.

This proves that there is, indeed, a sharp transition between the SAT and the UNSAT phases, and
the transition value of $\alpha$ is obtained from the condition

$$ 1 = \alpha' = \frac{\alpha (b^*)^k}{1 - (1 - b^*) \left[ 1 + \alpha k (b^*)^{k-1} \right]} \tag{3.33} $$

(notice that $b^*$ is itself a function of $\alpha$, determined by (3.30)).
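Both thresholds can be computed numerically from the conditions above. The following is a minimal sketch, assuming that the vanishing of $n_1$ in (3.30) is rewritten as $b = 1 - e^{-\alpha k b^{k-1}}$ and solved for $\alpha$ as a function of $b$; the function names, the search intervals and the tolerances are choices made for this example and do not come from the original text.

```python
import math


def alpha_c(k, iters=200):
    """Smallest alpha for which b = 1 - exp(-alpha k b^(k-1)) has a root b > 0.

    Solving the equation for alpha gives alpha(b) = -log(1-b) / (k b^(k-1));
    its minimum over 0 < b < 1 is located by ternary search.
    Returns the pair (alpha_c, b_c).
    """
    def alpha_of_b(b):
        return -math.log(1.0 - b) / (k * b ** (k - 1))

    lo, hi = 1e-6, 1.0 - 1e-12
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if alpha_of_b(m1) < alpha_of_b(m2):
            hi = m2
        else:
            lo = m1
    b = 0.5 * (lo + hi)
    return alpha_of_b(b), b


def largest_root(alpha, k, tol=1e-14, max_iter=1_000_000):
    """Largest root b* of b = 1 - exp(-alpha k b^(k-1)), iterating down from 1."""
    b = 1.0
    for _ in range(max_iter):
        new_b = 1.0 - math.exp(-alpha * k * b ** (k - 1))
        if abs(new_b - b) < tol:
            return new_b
        b = new_b
    return b


def core_ratio(alpha, k):
    """Clause to variable ratio alpha' of the sub-formula, as in (3.32)."""
    b = largest_root(alpha, k)
    return alpha * b**k / (1.0 - (1.0 - b) * (1.0 + alpha * k * b ** (k - 1)))


def alpha_s(k, tol=1e-10):
    """Solve alpha'(alpha) = 1, condition (3.33), by bisection."""
    lo, hi = alpha_c(k)[0] + 1e-3, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if core_ratio(mid, k) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for k in (3, 4, 5, 6):
    a_c, b_c = alpha_c(k)
    print(k, a_c, alpha_s(k), b_c)
```

The bisection in `alpha_s` uses the fact, stated above, that $\alpha'$ is an increasing function of $\alpha$; the printed values can be compared with those of Table 3.1.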
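The average number of solutions of the sub-formula will be

$$ \mathcal{N}' = 2^{N'(1-\alpha')} = 2^{N \left\{ b^* - \alpha (b^*)^k + \alpha k \left[ (b^*)^k - (b^*)^{k-1} \right] \right\}} , \tag{3.34} $$

where $N'$ in the exponent is the number of variables of the sub-formula.

For each solution of the sub-formula, which I shall call "seed", the number of solutions of the
original formula that can be obtained is still given by (3.31), where now $t^* = \alpha [1 - (b^*)^k]$:

$$ \mathcal{N}_1 = 2^{N \left\{ 1 - b^* + \alpha k \left[ (b^*)^{k-1} - (b^*)^k \right] - \alpha \left[ 1 - (b^*)^k \right] \right\}} \tag{3.35} $$

where the subscript 1 is a reminder that this is for a fixed seed. Since for different seeds one necessarily
obtains different solutions of the original formula, the total number of solutions is

$$ \mathcal{N} = \mathcal{N}' \times \mathcal{N}_1 = 2^{N(1-\alpha)} \tag{3.36} $$

as expected.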
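The cancellation behind (3.36) can be made explicit by adding the exponents of (3.34) and (3.35) term by term. The short worked version below uses only quantities defined in this section:

```latex
\begin{align}
\frac{1}{N} \log_2 \mathcal{N}' &= b^* - \alpha (b^*)^k
    + \alpha k \left[ (b^*)^k - (b^*)^{k-1} \right] , \\
\frac{1}{N} \log_2 \mathcal{N}_1 &= 1 - b^*
    + \alpha k \left[ (b^*)^{k-1} - (b^*)^k \right]
    - \alpha \left[ 1 - (b^*)^k \right] , \\
\frac{1}{N} \log_2 \left( \mathcal{N}' \mathcal{N}_1 \right) &= 1 - \alpha .
\end{align}
```

The terms in $b^*$, in $\alpha k [\,\cdot\,]$ and in $\alpha (b^*)^k$ cancel pairwise, so that the total entropy is $1 - \alpha$ independently of the value of $b^*$.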

It is possible to prove the following properties (or at least, to give some non-rigorous arguments
supporting them, see [57, 55]):

1. The average distance $d_0$ between different solutions corresponding to the same seed is

$$ d_0 = \frac{1 - b^*}{2} \tag{3.37} $$

2. The average distance $d_1$ between solutions corresponding to different seeds is

$$ d_1 = \frac{1}{2} \tag{3.38} $$

3. The maximum distance between solutions corresponding to the same seed is smaller than the
minimum distance between solutions corresponding to different seeds.

4. For any two solutions $X$ and $X'$ corresponding to the same seed, there exists a sequence of
solutions $X_1, \dots, X_p$ such that $X_1 = X$, $X_p = X'$ and the (intensive) distance between $X_i$ and
$X_{i+1}$ is of order $o(1)$ as $N \to \infty$.
  $k$        $\alpha_c(k)$    $\alpha_s(k)$    $b_c(k)$
  3          0.81846916       0.91793528       0.71533186
  4          0.77227984       0.97677016       0.85100070
  5          0.70178027       0.99243839       0.90335038
  6          0.63708113       0.99737955       0.93007969
  $\infty$   $\log k / k$     $1 - e^{-k}$     $1 - 1/(k \log k)$

Table 3.1: Threshold values for the clustering and SAT/UNSAT transitions and backbone size (at
the clustering transition) for various values of $k$ and (to the leading order) for $k \to \infty$.



3.2.3 Phase diagram of k-XORSAT

Based on the previous analysis, the following phase diagram can be determined. Each statement
is valid with probability 1 in the limit $N \to \infty$ for random $k$-XORSAT formulae extracted from the
uniform distribution and with $k \geq 3$.

The phase diagram of $k$-XORSAT consists of three phases, dependent on the ratio $\alpha$ of clauses per
variable, separated by sharp transitions located at $\alpha_c(k)$ (for clustering) and $\alpha_s(k)$ (for SAT/UNSAT).
Numerical values of the thresholds for finite $k$, and their asymptotics for $k \to \infty$, are shown in Table 3.1.

For $\alpha < \alpha_c(k)$ the formula is SAT and the solutions are homogeneously distributed in the space of
configurations. Two random solutions are at an (intensive) distance $d \simeq 1/2$, and they are connected
by a sequence of solutions separated by a distance of order $o(1)$. The total number of solutions is
given by (3.6):

$$ \mathcal{N} = 2^{N(1-\alpha)} . \tag{3.39} $$

The value of the threshold $\alpha_c(k)$ is the smallest value of $\alpha$ such that the equation

$$ b = 1 - e^{-\alpha k\, b^{k-1}} \tag{3.40} $$

has a solution with $b > 0$.

For $\alpha_c(k) < \alpha < \alpha_s(k)$, the formula is SAT and the solutions are clustered. Each cluster is
identified by a particular solution of the sub-formula generated by the leaf-removal algorithm, called
a seed. The solutions belonging to the same cluster are connected, the average distance between two of
them is $(1 - b^*)/2$ and their number is given by (3.35):

$$ \mathcal{N}_1 = 2^{N \left\{ 1 - b^* + \alpha k \left[ (b^*)^{k-1} - (b^*)^k \right] - \alpha \left[ 1 - (b^*)^k \right] \right\}} \tag{3.41} $$

where $b^*$ is the largest solution of (3.40), which represents the fraction of variables that take the
same value in each solution of a given cluster and is called back-bone size. The solutions belonging to
different clusters are well separated, the average distance between two of them is 1/2 and the number
of clusters is given by (3.34):

$$ \mathcal{N}' = 2^{N'(1-\alpha')} = 2^{N \left\{ b^* - \alpha (b^*)^k + \alpha k \left[ (b^*)^k - (b^*)^{k-1} \right] \right\}} . \tag{3.42} $$

The threshold value $\alpha_s(k)$ is given by the condition (3.32):

$$ \frac{\alpha (b^*)^k}{1 - (1 - b^*) \left[ 1 + \alpha k (b^*)^{k-1} \right]} = 1 . \tag{3.43} $$



For $\alpha_s(k) < \alpha$ the formula is UNSAT. Note that as $\alpha \to \alpha_s(k)$ from below, the entropy of the
number of clusters goes to 0, i.e. the number of clusters becomes sub-exponential in $N$, while the
number of solutions (inside each cluster) remains exponential in $N$, and discontinuously jumps to
0 as $\alpha$ crosses $\alpha_s(k)$.

Figure 3.1: Total entropy $s(\alpha)$ (full line) and entropy of the number of clusters $\sigma(\alpha)$ (dashed line) as
functions of $\alpha$ for 3-XORSAT. The curve for $\sigma(\alpha)$ starts at $\alpha = \alpha_c(3) \simeq 0.818$. The curve for $s(\alpha)$ ends
at $\alpha = \alpha_s(3) \simeq 0.918$, where $\sigma(\alpha) = 0$. The right hand panel is an inset of the full figure, on the left.

The entropies (i.e. $\log \mathcal{N} / N$) of the number of clusters and of the (total) number of solutions are
shown in Figure 3.1 for $k = 3$.

3.3 Heuristic results on the phase diagram of k-SAT

The simple graph-theoretical arguments that allow the complete and rigorous characterization of the
phase diagram of $k$-XORSAT do not apply in the case of $k$-SAT. Not only are the methods required to
derive it more complicated (and not rigorous), but the phase diagram itself is more complicated.

SAT/UNSAT transition

The existence of a SAT/UNSAT transition in $k$-SAT has been proved rigorously, but the proof of its
sharpness remains an open problem. In fact, the following was proved by Friedgut [59]:

Theorem For each $k \geq 2$, there exists a sequence $\alpha_N(k)$ such that, for all $\epsilon > 0$,

$$ \lim_{N \to \infty} P[\mathrm{SAT} \,|\, N, \alpha] = \begin{cases} 1 & \text{if } \alpha = (1 - \epsilon)\, \alpha_N(k) \\ 0 & \text{if } \alpha = (1 + \epsilon)\, \alpha_N(k) \end{cases} \tag{3.44} $$

where $P[\mathrm{SAT} \,|\, N, \alpha]$ is the probability that a uniformly random $k$-SAT formula with $N$ variables and $\alpha N$
clauses is satisfiable.

Note that this theorem proves a non-uniform convergence: the threshold value is a function of $N$,
which does not necessarily converge to a constant. This theorem does not imply that the SAT/UNSAT
transition is sharp, but it proves that it exists. The sharpness of the transition remains a conjecture.

Rigorous upper and lower bounds have been proved for the threshold $\alpha_s(k)$ for finite $k$ and
asymptotically as $k \to \infty$ (for a review and latest results, see [50]). Some values are listed in Table 3.2.

Finally, the best available estimates of the value of $\alpha_s(k)$ have been obtained with methods derived
from statistical mechanics: the analysis of a message passing procedure called Survey Propagation
(SP), which is based on the cavity method [61]. Some values obtained from the analysis of SP are
reported in Table 3.2.
  $k$        $\alpha^-(k)$       $\alpha^*(k)$         $\alpha^+(k)$
  3          3.52                4.267                 4.51
  4          7.91                9.931                 10.23
  5          18.79               21.117                21.33
  $\infty$   $2^k \log 2 - k$    $2^k \log 2 - b_k$    $2^k \log 2$

Table 3.2: Threshold values for the SAT/UNSAT transition in $k$-SAT. $\alpha^-(k)$ is a rigorous lower bound,
$\alpha^*(k)$ is the prediction from the cavity method, and $\alpha^+(k)$ is a rigorous upper bound. For $k \to \infty$ the
rigorous bounds are exact, while in the result from the cavity computation, $b_k$ is a positive function
of $k$ which converges to $(1 + \log 2)/2$ as $k \to \infty$. From [50, 61].



  $k$    $\alpha_c(k)$    $\alpha_{cond}(k)$    $\alpha_s(k)$
  3      3.86             3.86                  4.267
  4      9.38             9.547                 9.931
  5      19.16            20.80                 21.117

Table 3.3: Threshold values for the clustering ($\alpha_c$) and condensation ($\alpha_{cond}$) transitions in $k$-SAT.
The values of $\alpha_s(k)$ from Table 3.2 are repeated for comparison. From [65].

Clustering transition 

The satisfiable phase of $k$-SAT has a very rich structure, presenting several phase transitions that
concern the geometry of the satisfying assignments. The first such transition is the clustering one.

The definition of the clustering phenomenon itself is much more complicated for $k$-SAT than for
$k$-XORSAT. As we have seen, clustering in $k$-XORSAT has a geometrical origin: the set of variables of a
formula can be decomposed in two: the backbone, made of variables that are determined by solving
a sub-formula of the original problem; and the leaves, that are free to take any value in any solution.
This structure naturally implies the clustering of solutions, and also two properties of the clusters:
first, that all clusters contain the same number of solutions; second, that the variables that are frozen
inside a cluster are the same for all clusters.

In $k$-SAT, these two properties do not hold. The fact that the variables that freeze in different
clusters are not the same requires a definition of clusters independent of the backbone. This can be
done by defining the clusters as a partition of the solutions such that:

1. The distance between any pair of solutions belonging to different clusters is larger than the
distance between any pair of solutions belonging to the same cluster;

2. For any pair of solutions $(X, Y)$ belonging to the same cluster, a sequence of solutions $\{X_1, \dots, X_n\}$
can be made such that $X_1 = X$, $X_n = Y$ and the distance between $X_i$ and $X_{i+1}$ is of order
$o(1)$ (as $N \to \infty$).

This approach is followed in [62, 63], where rigorous results are obtained for $k \geq 8$. Moreover,
non-rigorous results based on the cavity method are available for any $k$ [HI], and are reported in Table 3.3.

Notice, however, that the clustering phenomenon was first suggested for $k$-SAT in [SI], where a
"variational" replica calculation was performed: based on physical intuition, a simple trial function
with few free parameters was used as the functional order parameter for the free energy, as in (1.64),
and the values of the parameters were set by finding the extremum of the corresponding entropy.
With this method, an approximation to the RSB solution which describes the clustered phase was
found. This led to the calculation of approximate values of the clustering threshold $\alpha_c(k)$. In the
same paper, the other difficulty arising in $k$-SAT, i.e. the fact that different clusters have different
sizes, was pointed out.

This is a very important fact, as it gives rise to two more phase transitions.

Condensation and freezing transitions 

Let us denote, as usual, the entropy of the number of clusters by $\Sigma$, the internal entropy of a cluster as
$s_i$ and the total entropy as $s$. Each of them is defined as the logarithm of the corresponding number
of objects divided by $N$. When different clusters have different sizes, a convenient way of accounting
for them is to write $\Sigma$ as a function of $s_i$: $\Sigma(s_i)$ is the entropy of the number of clusters that have
internal entropy $s_i$. The total entropy is then

$$ s = \frac{1}{N} \log \int ds_i \, e^{N [ \Sigma(s_i) + s_i ]} . \tag{3.45} $$

The measure of the number of solutions will be dominated by the maximum of the integrand, i.e. by
the value

$$ s^* : \quad \Sigma'(s^*) = -1 . \tag{3.46} $$

At the clustering transition $\alpha_c(k)$, the complexity $\Sigma(s^*)$ becomes positive: the space of solutions
splits into an exponential number of well separated clusters, each containing an exponential number of
solutions. As $\alpha$ grows, the number of solutions decreases. More specifically, it is $\Sigma(s^*)$ which decreases,
and for $\alpha = \alpha_{cond}(k) < \alpha_s(k)$, it vanishes. When this happens, both the number of solutions and the
number of clusters are still exponential; however, the measure of the number of solutions is dominated
by a sub-exponential number of clusters, corresponding to the largest $s_i$. As $\alpha$ increases further, the
value of the maximum of $\Sigma(s_i)$ decreases, until it vanishes at $\alpha = \alpha_s(k)$, the SAT/UNSAT transition.
When this happens, the number of solutions vanishes abruptly, with a discontinuity in $s$.

Still another phase transition occurs for intermediate values of $\alpha$, corresponding to the freezing
of variables within a cluster. For $\alpha_c(k) < \alpha < \alpha_f(k)$, there are no frozen variables (even within a
cluster), while for $\alpha_f(k) < \alpha$ frozen variables are present.

Part II

Some properties of random k-SAT
and random k-XORSAT

Chapter 4

Study of poissonian heuristics for
DPLL in k-XORSAT



In this chapter I shall present some new results on the relationship between the clustering transition
of $k$-XORSAT and the performance of DPLL algorithms, obtained with Rémi Monasson and Francesco
Zamponi and published in [67].

It is generally believed that local algorithms cannot succeed (in finding solutions) in the clustered 
phase of random CSP. In this context "local" means that the algorithm decides assignments based on 
local information, such as the values of variables within a finite subset of clauses. Local algorithms 
therefore include, for example, search algorithms such as Metropolis or WalkSAT, and also the DPLL 
procedure. The basic argument supporting this belief is that in the clustered phase an extensive 
back-bone of frozen variables exists, which requires an extensive number of variables to take values 
that are strongly correlated. An optimization procedure which only takes into account a finite portion 
of the problem will not be able to find a correct assignment for the back-bone, and therefore for the 
problem. 

An alternative argument is directly derived from spin glass theory: the free energy landscape of 
random CSP in the clustered phase is characterized by a large number of states, most of which have 
positive energy, separated by extensive barriers. In order to go from a random configuration to a 
ground state the system must cross these barriers, which a local optimization procedure cannot do. 
If this argument is plausible for search procedures, which perform a random walk in the space of 
configurations while trying to minimize some cost function, and which therefore can indeed remain 
trapped in local minima of the free energy, it is not at all clear why it should apply to the DPLL 
procedure. Indeed, the only evidence supporting this claim for DPLL is that no heuristic is known to 
succeed in the clustered phase. 

The main result that I shall present in this chapter is that no DPLL heuristic which preserves
the poissonian distribution of occurrences in the sub-formulae it generates can find solutions in the
clustered phase. The essence of the argument, as we shall see, is related to the geometrical properties
of the graph underlying the formula (which allow the use of the leaf-removal procedure to characterize
the phases), and to the very basic fact that Unit Propagations cannot remove more than one clause
for each variable that they assign.

This result is valid for random $k$-XORSAT formulae extracted from the uniform distribution (with
probability 1 as $N \to \infty$, as usual). It is worth noting that it can be extended to a generalization
of $k$-XORSAT which goes under the name of Uniquely Extensible Constraint Satisfaction Problems,
or UE-CSP. In these problems, variables can take values in a set of cardinality $d$, and the form of
the constraints is such that $k$ variables appear in them, and that if any $k - 1$ variables appearing
in a constraint are assigned, then the value of the $k$-th variable is determined. It is very interesting
that $(d, k)$-UE-CSP is NP-complete for $d \geq 4$, $k \geq 3$. $k$-XORSAT is a special case of $(d, k)$-UE-CSP
with $d = 2$. However, as far as the DPLL procedure is concerned, the class of $(d, k)$-UE-CSP is
equivalent to $k$-XORSAT for any $k$ and $d$, since the only relevant feature for the sake of DPLL is that
Unit Propagations be possible, and the characteristic property of UE-CSPs ensures that they are. As
a consequence, there will be a sharp transition between a phase with a back-bone and a phase
without it, which will occur for some $\alpha_c(d, k)$, and all the results that we shall derive concerning
the performance of DPLL will be valid for $(d, k)$-UE-CSP as well.

The structure of this chapter is the following: in Section 4.1 I shall introduce a generalization
of the leaf-removal procedure to mixed formulae; this will allow me to introduce a potential function
that characterizes the phases of mixed formulae, in Section 4.2; in Section 4.3 I shall characterize the
trajectories that poissonian heuristics generate in the space of the density of clauses $\{c_j\}$; then I shall
derive an upper bound for the values of $\alpha$ for which poissonian heuristics for DPLL can find solutions,
in Section 4.4; in Section 4.5 I shall present an argument supporting that GUC saturates the previous
bound in the limit $k \to \infty$; finally, in Section 4.6 I shall discuss the results obtained and indicate
some possible directions of further investigation.

4.1 Leaf-removal for mixed formulae 

In paragraph 3.2.2 I described the leaf-removal procedure applied to a pure $k$-XORSAT formula, that
is to say a formula in which all the clauses involve exactly $k$ variables, as was introduced in [56, 57].
The leaf-removal procedure is extremely powerful, as it provides a full characterization of the phase
diagram of $k$-XORSAT. In this Section I shall generalize the analysis of the leaf-removal procedure to
the case of mixed formulae, containing clauses of different lengths (where length stands for the number
of variables in the clause), in order to allow the characterization of the sub-formulae generated by
DPLL heuristics.

4.1.1 Leaf-removal differential equations 

Let us consider a random XORSAT formula with $N$ variables and a total of $M$ clauses of different
lengths $j = 2, 3, \dots, k$. We don't consider clauses of length 1 since they are trivial, and we denote by
$k$ the maximum clause length. The number of clauses of length $j$ will be denoted by $C_j(0)$, where
the 0 indicates that this is the initial formula (relative to the action of the leaf-removal), and we
will have $M = \sum_{j=2}^{k} C_j(0) = \alpha N$ for some finite $\alpha$. We shall also denote by $N_\ell(0)$ the number of
variables with $\ell$ occurrences, and therefore $\sum_{\ell=0}^{\infty} N_\ell(0) = N$. We assume that the formula is formed
by selecting uniformly at random the index of the variable appearing in each "slot" of each clause
(with no repetitions within a clause). The distribution of the number of occurrences of the variables
in the formula is then a poissonian with parameter $\lambda(0)$. Notice that the distribution of occurrences
is independent of the clause lengths (i.e. the distribution of occurrences in clauses of length $j$ is the
same for all $j$).

The leaf-removal proceeds in steps. Let us denote by $T$ the number of steps that have been
performed. At each step, a single-occurrence variable is selected, and the clause in which it appears is
removed. What is the probability $p(j)$ that a single-occurrence variable appears in a clause of length
$j$? By definition, a single-occurrence variable occupies a unique slot in the formula. Since each slot
can contain any variable with uniform probability, $p(j)$ will be equal to the fraction of slots
that belong to clauses of length $j$:

\[ p(j) = \frac{j\, C_j(T)}{\sum_{j'=2}^{k} j'\, C_{j'}(T)} \tag{4.1} \]

where $C_j(T)$ denotes the number of clauses of length $j$ after $T$ steps of leaf-removal. We shall then have

\[ \mathbb{E}[C_j(T+1) - C_j(T)] = -p(j) = -\frac{j\, C_j(T)}{\sum_{j'=2}^{k} j'\, C_{j'}(T)} . \tag{4.2} \]

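Moreover, if the removed clause has length $j$, the average number of variables (excluding the one to
be eliminated) with $\ell$ occurrences that appear in it will be $(j-1)\,\ell N_\ell(T) / \sum_{\ell'} \ell' N_{\ell'}(T)$, and therefore

\[ \mathbb{E}[N_\ell(T+1) - N_\ell(T) \mid j] = (j-1)\, \frac{(\ell+1) N_{\ell+1}(T) - \ell N_\ell(T)}{\sum_{\ell'} \ell' N_{\ell'}(T)} + \delta_{\ell,0} - \delta_{\ell,1} \tag{4.3} \]

where the Kronecker deltas come from the single-occurrence variable being eliminated. Multiplying
by $p(j)$ and summing over $j$,

\[ \mathbb{E}[N_\ell(T+1) - N_\ell(T)] = \sum_{j=2}^{k} p(j)\, \mathbb{E}[N_\ell(T+1) - N_\ell(T) \mid j] \tag{4.4} \]

\[ \phantom{\mathbb{E}[N_\ell(T+1) - N_\ell(T)]} = \sum_{j=2}^{k} \frac{j(j-1)\, C_j(T)}{\sum_{j'} j' C_{j'}(T)} \, \frac{(\ell+1) N_{\ell+1}(T) - \ell N_\ell(T)}{\sum_{\ell'} \ell' N_{\ell'}(T)} + \delta_{\ell,0} - \delta_{\ell,1} . \tag{4.5} \]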

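In the limit $N \to \infty$ the variations in (4.2) and (4.5) are of $O(1)$, and we can apply Wormald's theorem
to obtain the following differential equations for $n_\ell(t) = \mathbb{E}[N_\ell(Nt)/N]$ and $c_j(t) = \mathbb{E}[C_j(Nt)/N]$:

\[
\begin{cases}
\dfrac{dc_j(t)}{dt} = -\dfrac{j\, c_j(t)}{\sum_{j'=2}^{k} j' c_{j'}(t)} \\[3mm]
\dfrac{dn_\ell(t)}{dt} = \displaystyle\sum_{j=2}^{k} \dfrac{j(j-1)\, c_j(t)}{\sum_{j'} j' c_{j'}(t)} \, \dfrac{(\ell+1)\, n_{\ell+1}(t) - \ell\, n_\ell(t)}{\sum_{\ell'} \ell' n_{\ell'}(t)} + \delta_{\ell,0} - \delta_{\ell,1}
\end{cases}
\tag{4.6}
\]

The initial conditions for $c_j(t)$ are trivial (i.e. $c_j(0) = C_j(0)/N$), while those for $n_\ell$ are:

\[ n_\ell(0) = e^{-\lambda(0)}\, \frac{\lambda(0)^\ell}{\ell!} . \tag{4.7} \]

Since the parameter of the poissonian coincides with its average, we shall have

\[ \lambda(0) = \sum_{\ell=0}^{\infty} \ell\, n_\ell(0) = \sum_{j=2}^{k} j\, c_j(0) \tag{4.8} \]

where the last equality comes from the fact that the two sums in (4.8) give the number of slots in the
formula, and therefore are equal.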
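The leaf-removal step analyzed above can be illustrated with a short Python sketch. It is only a minimal illustration under simplifying assumptions, not the procedure used to obtain the results of this chapter: the function name `leaf_removal` and the representation of a clause as a collection of variable indices are hypothetical, the parities of the clauses are ignored because they play no role in the removal, and leaves are processed in an arbitrary order rather than uniformly at random (the surviving sub-formula does not depend on the order).

```python
from collections import defaultdict


def leaf_removal(clauses):
    """Return the indices of the clauses that survive leaf-removal.

    `clauses` is a list of iterables of variable indices (parities omitted).
    """
    clauses = [set(c) for c in clauses]
    occurrences = defaultdict(set)      # variable -> indices of the clauses containing it
    for idx, clause in enumerate(clauses):
        for v in clause:
            occurrences[v].add(idx)

    alive = set(range(len(clauses)))
    leaves = [v for v, occ in occurrences.items() if len(occ) == 1]
    while leaves:
        v = leaves.pop()
        if len(occurrences[v]) != 1:    # stale entry: v no longer occurs exactly once
            continue
        (idx,) = occurrences[v]         # the unique clause containing v
        alive.discard(idx)
        for u in clauses[idx]:          # removing the clause lowers the occurrence counts
            occurrences[u].discard(idx)
            if len(occurrences[u]) == 1:
                leaves.append(u)
    return sorted(alive)


chain = [(0, 1, 2), (2, 3, 4), (4, 5, 6)]        # every clause is eventually removed
triangle = [(0, 1), (1, 2), (0, 2), (2, 3)]      # the first three clauses form a core
print(leaf_removal(chain), leaf_removal(triangle))
```

In the first example each removal creates a new single-occurrence variable, so the procedure empties the formula; in the second, only the clause containing variable 3 can be removed and the remaining three clauses form an irreducible sub-formula.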

4.1.2 Solution for $c_j(t)$

In order to solve (4.6), we observe two things: first, that the equations for $\{c_j(t)\}$ are independent of
$\{n_\ell(t)\}$; second, that the equation for $c_k(t)$ implies that, as long as $c_k(t) > 0$, it is a strictly decreasing
function of $t$, and therefore $c_k$ can be used as an independent variable instead of $t$. We then divide
the equations for $c_j$ with $j = 2, 3, \dots, k-1$ by the equation for $c_k$, obtaining

\[ \frac{dc_j}{dc_k} = \frac{j\, c_j}{k\, c_k} \tag{4.9} \]

which admits the solution

\[ c_j = c_j^0 \left( \frac{c_k}{c_k^0} \right)^{j/k} \tag{4.10} \]

where $c_j^0$ is the value of $c_j$ at $t = 0$. It is convenient to introduce

\[ b(t) \equiv \left[ \frac{c_k(t)}{c_k(0)} \right]^{1/k} \tag{4.11} \]

so that (4.10) becomes

\[ c_j(t) = c_j(0)\, b(t)^j . \tag{4.12} \]

Notice that $b$ is an invertible function of $c_k$, and therefore of $t$.

Let us also introduce the generating function $\gamma(b)$ of the $c_j(0)$, which will play a very important
role in the following:

\[ \gamma(b) \equiv \sum_{j=2}^{k} c_j(0)\, b^j \tag{4.13} \]

so that

\[ \gamma(b(t)) = \sum_{j=2}^{k} c_j(0)\, b(t)^j = \sum_{j=2}^{k} c_j(t) \tag{4.14} \]

is the total number of clauses at time $t$ and

\[ b(t)\, \gamma'(b(t)) = b(t) \left. \frac{d\gamma}{db} \right|_{b=b(t)} = \sum_{j=2}^{k} j\, c_j(t) = \sum_{\ell=0}^{\infty} \ell\, n_\ell(t) \tag{4.15} \]

is the number of slots in the formula at time $t$. Since exactly one equation is removed at each step,
one must have

\[ \gamma(b(t)) = \alpha - t \tag{4.16} \]

which implicitly defines $b(t)$ through (4.13):

\[ t(b) = \alpha - \sum_{j=2}^{k} c_j(0)\, b^j . \tag{4.17} \]
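Since $\gamma(b)$ is an increasing function of $b$ on $[0, 1]$, the relation $\gamma(b) = \alpha - t$ can be inverted numerically by bisection. The following Python lines are a minimal sketch of this inversion; the function names and the dictionary representation `{j: c_j(0)}` of the initial densities are assumptions made for the illustration only.

```python
def gamma(c0, b):
    """Generating function gamma(b) = sum_j c_j(0) b^j; c0 maps j -> c_j(0)."""
    return sum(cj * b ** j for j, cj in c0.items())


def b_of_t(c0, t, tol=1e-12):
    """Solve gamma(b) = alpha - t for b in [0, 1] by bisection."""
    alpha = sum(c0.values())            # gamma(1) = alpha
    target = alpha - t
    if not 0.0 <= target <= alpha:
        raise ValueError("t must lie between 0 and alpha")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma(c0, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def densities(c0, t):
    """Clause densities c_j(t) = c_j(0) b(t)^j after a fraction t of leaf-removal steps."""
    b = b_of_t(c0, t)
    return {j: cj * b ** j for j, cj in c0.items()}


print(densities({2: 0.2, 3: 0.5}, 0.1))
```

For a pure formula the inversion can be checked against the closed form $b(t) = (1 - t/\alpha)^{1/k}$.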

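4.1.3 Solution for $n_\ell(t)$

We can now write the equations for $\{n_\ell(t) \mid \ell \geq 2\}$ in (4.6) as

\[ \frac{dn_\ell}{dt} = \left. \frac{b^2 \gamma''(b)}{[b\, \gamma'(b)]^2} \right|_{b=b(t)} [(\ell+1)\, n_{\ell+1} - \ell\, n_\ell] = \left. \frac{\gamma''(b)}{\gamma'(b)^2} \right|_{b=b(t)} [(\ell+1)\, n_{\ell+1} - \ell\, n_\ell] . \tag{4.18} \]

As in the case of pure $k$-XORSAT formulae, the distribution of occurrences (for $\ell \geq 2$) remains poissonian
at all times:

\[ n_\ell(t) = e^{-\lambda(t)}\, \frac{\lambda(t)^\ell}{\ell!} \qquad (\ell \geq 2) . \tag{4.19} \]

This is easily seen by substituting this expression in (4.18), which gives an equation for $\lambda$ which is
independent of $\ell$:

\[ \frac{d\lambda}{dt} = -\frac{\gamma''(b)}{\gamma'(b)^2}\, \lambda \tag{4.20} \]

where $b = b(t)$. This is solved by noticing from (4.17) that

\[ \frac{dt}{db} = -\gamma'(b) \tag{4.21} \]

so that

\[ \frac{d\lambda}{db} = \frac{\gamma''(b)}{\gamma'(b)}\, \lambda \tag{4.22} \]

with the initial condition that for $t = 0$, which corresponds to $b = 1$, $\lambda$ must be equal to $\lambda(0) = \sum_{j=2}^{k} j\, c_j(0)$. The solution is then:

\[ \lambda(b) = \gamma'(b) \tag{4.23} \]

and we obtain

\[ n_\ell(b) = e^{-\gamma'(b)}\, \frac{\gamma'(b)^\ell}{\ell!} \qquad (\ell \geq 2) \tag{4.24} \]

with $b = b(t)$ obtained by inverting $t = \alpha - \gamma(b)$.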
For $\ell = 1$ we write

\[ n_1(b) = \sum_{\ell=1}^{\infty} \ell\, n_\ell(b) - \sum_{\ell=2}^{\infty} \ell\, n_\ell(b) = b\, \gamma'(b) - \gamma'(b) \left[ 1 - e^{-\gamma'(b)} \right] = \gamma'(b) \left[ b - 1 + e^{-\gamma'(b)} \right] . \tag{4.25} \]

The leaf-removal will end when $n_1(b) = 0$ for some $b \in [0, 1]$, which gives:

\[ b = 1 - e^{-\gamma'(b)} . \tag{4.26} \]

Let us denote by $b^*$ the largest solution of this equation. If $b^* = 0$ the leaf-removal removes
all the clauses from the formula, which is SAT (with probability 1), and the solutions are unclustered.
If $b^* > 0$ the leaf-removal ends with an irreducible sub-formula. The number of clauses
in the sub-formula is $N \sum_{j=2}^{k} c_j(t^*) = N \gamma(b^*)$ and the number of variables is

\[ N \sum_{\ell=2}^{\infty} n_\ell(b^*) = N e^{-\gamma'(b^*)} \left[ e^{\gamma'(b^*)} - 1 - \gamma'(b^*) \right] . \]

The sub-formula is SAT if and only if the number of clauses is
smaller than or equal to the number of variables:

\[ \gamma(b^*) \leq b^* + (1 - b^*) \log(1 - b^*) \tag{4.27} \]

where we have used (4.26). The SAT/UNSAT transition occurs when this bound is saturated.
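As a numerical illustration, the conditions (4.26) and (4.27) can be evaluated for pure formulae, where $\gamma(b) = \alpha b^k$. The sketch below is only indicative: the function names are hypothetical, the largest root of (4.26) is located by a coarse scan followed by bisection (an arbitrary choice of method), and the thresholds are found by bisection on $\alpha$. For $k = 3$ it should give values close to $0.818$ and $0.918$, the thresholds of pure 3-XORSAT quoted in this chapter.

```python
import math


def backbone(alpha, k, grid=4000):
    """Largest root of b = 1 - exp(-alpha k b^(k-1)) in (0, 1], or 0.0 if there is none."""
    g = lambda b: 1.0 - math.exp(-alpha * k * b ** (k - 1)) - b
    hi = 1.0                                  # g(1) < 0 always
    for i in range(grid - 1, 0, -1):          # scan downwards from b = 1
        lo = i / grid
        if g(lo) >= 0.0:                      # sign change between lo and hi
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if g(mid) >= 0.0:
                    lo = mid
                else:
                    hi = mid
            return lo
        hi = lo
    return 0.0


def core_is_sat(alpha, k):
    """Bound (4.27): clause density of the core <= variable density of the core."""
    b = backbone(alpha, k)
    if b == 0.0:
        return True
    return alpha * b ** k <= b + (1.0 - b) * math.log(1.0 - b)


def threshold(predicate, lo, hi, iters=40):
    """Bisection for the point where predicate switches from False (lo) to True (hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if predicate(mid):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


alpha_c = threshold(lambda a: backbone(a, 3) > 0.0, 0.5, 1.0)
alpha_s = threshold(lambda a: not core_is_sat(a, 3), 0.5, 1.0)
print(alpha_c, alpha_s)
```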



4.2 Characterization of the phases in terms of a potential 

4.2.1 Definition and properties of the potential $V(b)$

Let us define the following potential, which is a function of $b$:

\[ V(b) = -\gamma(b) + b + (1 - b) \log(1 - b) . \tag{4.28} \]

The derivative of $V(b)$ is

\[ V'(b) = -\gamma'(b) - \log(1 - b) , \tag{4.29} \]

and we see that for $b = b^*$, which verifies (4.26) $\Leftrightarrow \gamma'(b^*) = -\log(1 - b^*)$, we have

\[ V'(b^*) = 0 . \tag{4.30} \]

The value of $b^*$ can therefore be obtained from $V(b)$, looking for the largest value in $[0, 1]$ where the
derivative of $V$ vanishes.

In the unclustered phase, $V(b)$ has a unique minimum at $b^* = 0$. As $\alpha$ grows, a secondary minimum
develops for $b^* > 0$. The clustering transition occurs when this secondary minimum forms, and when
this happens one must have $V''(b^*) = 0$. On the other hand, from (4.27) and (4.28) one sees that the
SAT/UNSAT transition occurs when $V(b^*) = 0$. As in the pure case of paragraph 3.2.2, $b^*$ is the size
of the back-bone, i.e. the fraction of variables that take the same value in each solution of a given
cluster.

Figure 4.1: Potential $V(b)$ for different formulae. Each was obtained by applying the UC heuristic
to a 3-XORSAT formula with $\alpha = 0.8$ for different times: from top to bottom $t = \{0,\ t_c = 0.02957,\ 0.07327,\ t_s = 0.11697,\ 0.20642\}$. The first curve shows that the formula is in the unclustered
phase; the second curve corresponds to the clustering transition; the third to a clustered formula; the
fourth to the SAT/UNSAT transition; finally, the formula is UNSAT.

It is therefore possible to characterize the phase to which the formula belongs in terms of $V(b)$:

\[ \text{Back-bone size:} \qquad b^* = \max_{b \in [0,1]} \{ b : V'(b) = 0 \} \tag{4.31} \]

\[ \text{Clustering transition:} \qquad V''(b^*) = 0 \tag{4.32} \]

\[ \text{SAT/UNSAT transition:} \qquad V(b^*) = 0 \tag{4.33} \]

An example of potential is provided in Figure 4.1. The formulae considered for each curve are generated
by the UC heuristic applied to a 3-XORSAT formula with $\alpha = 0.8$. Each curve corresponds to a different
time during the evolution under UC (more detailed explanations are given in Section 4.3).

Notice that, given an arbitrary set of clause densities $\{c_2, \dots, c_k\}$, it is not a priori a trivial task to
determine whether random formulae conditioned by $\{c_j\}$ are SAT or not, and if they are SAT, whether
their solutions are clustered or not. However, it suffices to compute $V(b)$ for the given set of $c_j$'s
and, from its "shape", the answers to the previous questions become immediately clear. This is what
makes the potential $V(b)$ such a powerful tool in the study of the phase transitions of $k$-XORSAT (and
of $(d, k)$-UE-CSP).
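To make this concrete, the following Python sketch classifies an arbitrary set of clause densities by computing the largest stationary point $b^*$ of $V$ and the sign of $V(b^*)$. It is a rough illustration under stated assumptions: the function names and the dictionary representation `{j: c_j}` are hypothetical, the root search is a simple grid scan followed by bisection, and a back-bone smaller than the grid spacing would be missed.

```python
import math


def gamma(c, b, order=0):
    """sum_j c_j b^j (order 0) or its first derivative in b (order 1)."""
    if order == 0:
        return sum(cj * b ** j for j, cj in c.items())
    return sum(j * cj * b ** (j - 1) for j, cj in c.items())


def potential(c, b):
    """V(b) = -gamma(b) + b + (1 - b) log(1 - b), for 0 <= b < 1."""
    return -gamma(c, b) + b + (1.0 - b) * math.log(1.0 - b)


def backbone(c, grid=4000):
    """Largest root of b = 1 - exp(-gamma'(b)) in (0, 1], or 0.0 if there is none."""
    g = lambda b: 1.0 - math.exp(-gamma(c, b, 1)) - b
    hi = 1.0
    for i in range(grid - 1, 0, -1):
        lo = i / grid
        if g(lo) >= 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if g(mid) >= 0.0:
                    lo = mid
                else:
                    hi = mid
            return lo
        hi = lo
    return 0.0


def phase(c):
    """Return 'unclustered', 'clustered' or 'unsat' for clause densities c = {j: c_j}."""
    b_star = backbone(c)
    if b_star == 0.0:
        return "unclustered"
    return "clustered" if potential(c, b_star) > 0.0 else "unsat"


for c in ({3: 0.8}, {3: 0.85}, {3: 0.95}, {2: 0.2, 3: 0.4}, {2: 0.4, 3: 0.7}):
    print(c, phase(c))
```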

Interestingly, a "potential method" was already well known in mean field theory of spin glasses.
It was originally introduced by Parisi [68] and developed by him and Franz [69, 70, 71, 72] and by
Monasson [72]. "Their" potential is derived in a completely different way: it is defined by considering
two real replicas of the system (i.e. two identical samples), with an interaction term that depends on
the overlap $q$ between their configurations. The first replica is allowed to equilibrate at temperature $T$,
without "feeling" the effect of the coupling, while the second replica equilibrates at the same temperature
but is subject to the interaction. The effect of the interaction is to constrain the configurations
of the second replica to those that have a fixed overlap with the equilibrium configurations of the first
one. The potential $V(q)$ is then defined as the free energy of the second replica as a function of $q$, in
the limit in which the interaction strength vanishes.

Even though the potential $V(q)$ is defined in a completely different way from the potential $V(b)$
defined in this Section, the two share many common features. First, both are functions of the overlap
$q$, or equivalently of the fraction of frozen spins $b$; second, the properties of their minima determine
the phase transitions of the system (in this regard, the definition of $V(q)$ as a free energy is much
more transparent); third, the value of the potential corresponding to the secondary minimum (when
it is present) is equal to the complexity. In fact, it should be possible to prove that the two potentials
are actually identical by computing the full expression of the 1RSB free energy of $k$-XORSAT in the
case of a mixed system, and deriving the explicit expression of $V(q)$ in the most general case. The
fact that the same potential can be obtained following two approaches that are so different is very
interesting in itself.



4.2.2 Phase diagram for mixed $k$-XORSAT formulae

For pure formulae the phase diagram depends on a single parameter, $\alpha$. For mixed formulae, the phase
diagram is more complicated, as the space of parameters is $\mathcal{C} = \{c_2, \dots, c_k\}$ which has dimension
$k - 1$. Each one of the $c_j$'s varies in $[0, 1]$, because if some $c_{j'} > 1$ then the sub-formula containing
only the clauses of length $j'$ is UNSAT (and therefore so is the complete formula). For any point $c \in \mathcal{C}$
we can compute the potential $V(b)$, which depends on $c$ through $\gamma(b) = \sum_{j=2}^{k} c_j b^j$, and define
$b^*$ as the largest solution of $b = 1 - e^{-\gamma'(b)}$ in $[0, 1]$.

The phase transitions are characterized by the conditions (4.32) and (4.33). The boundary between
the unclustered and the clustered phase will be the $(k-2)$-dimensional surface $\Sigma_c$ defined by:

\[ \Sigma_c = \{ c \in \mathcal{C} : (b^* > 0) \wedge (V''(b^*) = 0) \} . \tag{4.34} \]

The boundary between the SAT and the UNSAT regions in $\mathcal{C}$ will be the $(k-2)$-dimensional surface $\Sigma_s$
defined by:

\[ \Sigma_s = \{ c \in \mathcal{C} : (b^* > 0) \wedge (V(b^*) = 0) \} . \tag{4.35} \]

Notice that in $b = 0$ one always has $V(0) = V'(0) = 0$, because the first term in $\gamma(b)$ is $c_2 b^2$ and
$b + (1 - b) \log(1 - b) = b^2/2 + O(b^3)$ for small $b$. Also, for $c_2 = 1/2$ one has $V''(0) = 0$ (irrespectively
of the values of $c_j$ for $j > 2$). Therefore, for $c_2 = 1/2$, $b = 0$ is formally a solution of $V(b) = 0$ and of
$V''(b) = 0$. Even though the surfaces $\Sigma_c$ and $\Sigma_s$ are defined with $b^* > 0$, it is possible that $b^* \to 0$ if
the local minimum at $b^* > 0$ merges into the global minimum of $V$ in $b = 0$. This can happen if and
only if $V'''(0) = 0$ (so that $b = 0$ becomes a "flat" saddle of $V$), which is obtained for $c_3 = 1/6$ (as
seen by taking the term in $b^3$ in the above expansions). This implies that the two surfaces $\Sigma_c$ and $\Sigma_s$
intersect on the $(k-3)$-dimensional surface $\Sigma_k$ of equation:

\[ \Sigma_k = \left\{ c \in \mathcal{C} : \left( c_2 = \tfrac{1}{2} \right) \wedge \left( c_3 = \tfrac{1}{6} \right) \right\} . \tag{4.36} \]
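The values $c_2 = 1/2$ and $c_3 = 1/6$ can be read off directly from the Taylor expansion of the potential around $b = 0$. The short worked expansion below uses only the definitions of $\gamma(b)$ and $V(b)$, with the convention (introduced here for compactness) that $c_n = 0$ for $n > k$:

```latex
\begin{align*}
b + (1-b)\log(1-b)
  &= \sum_{n \ge 2} \frac{b^n}{n(n-1)}
   = \frac{b^2}{2} + \frac{b^3}{6} + \frac{b^4}{12} + \dots \\
V(b)
  &= \sum_{n \ge 2} \left[ \frac{1}{n(n-1)} - c_n \right] b^n
   = \left( \tfrac{1}{2} - c_2 \right) b^2
   + \left( \tfrac{1}{6} - c_3 \right) b^3
   + \left( \tfrac{1}{12} - c_4 \right) b^4 + \dots
\end{align*}
```

Hence $V''(0) = 1 - 2 c_2$ and $V'''(0) = 1 - 6 c_3$, which vanish for $c_2 = 1/2$ and $c_3 = 1/6$ respectively.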

The suffix 'k' stands for critical (the 'c' being used for clustering), because $\Sigma_k$ is the surface where
the discontinuous phase transitions (both the clustering and the SAT/UNSAT ones) become continuous,
which is traditionally called critical point in statistical mechanics.

The surfaces $\Sigma_c$, $\Sigma_s$ and $\Sigma_k$ are tangent to each other. This can be seen by verifying that $c_2 =
1/2 - \varepsilon^2$ and $c_3 = 1/6 + \varepsilon$ verify (4.34), (4.35) and (4.36) with $b^* = \varepsilon$ (to the leading order in $\varepsilon \to 0$).

Figure 4.2: Phase diagram of mixed 4-XORSAT. Left: A pictorial view of the surfaces $\Sigma_c$ (full black)
and $\Sigma_s$ (dot-dashed red), intersecting on the segment $\Sigma_k$ (dashed blue), where they are tangent to
each other. Going from the origin out, the formulae are first unclustered, then clustered (after $\Sigma_c$ is
crossed) and finally UNSAT (after $\Sigma_s$ is crossed). Right: The sections of $\Sigma_c$ (full black) and $\Sigma_s$ (dot-dashed
red) at constant $c_2 = \{0, 0.1, 0.2, 0.3, 0.4, 0.5\}$ from top to bottom (top panel) and at constant
$c_4 = \{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7\}$ from top to bottom (bottom panel). The phase diagram for pure
formulae with $k = 3$ is formed by the $c_3$ axis of the bottom panel, which the two curves corresponding
to $c_4 = 0$ intersect at $c_3 = \alpha_c = 0.818$ and $c_3 = \alpha_s = 0.918$ respectively.

The fact that $\Sigma_c$ and $\Sigma_s$ have an intersection where they are tangent to each other is not at all clear a
priori (as we have seen, it depends on the specific form of $V(b)$), but it will turn out to be extremely
important in the following.

As an illustration, the phase diagram for $k = 4$ is shown in Figure 4.2.



4.3 Trajectories generated by poissonian heuristics 

In Section 2.4 I introduced the DPLL procedure and discussed some properties of two specific heuristics,
called respectively Unit Clause (UC) and Generalized Unit Clause (GUC), for the problem of
$(2+p)$-XORSAT. In this section I shall extend the same kind of analysis to more general heuristics
and to mixed formulae of any maximum length $k$.

I shall first define the class of heuristics considered, then derive some general properties of poissonian
heuristics that will be useful later in this chapter, and finally analyze the special cases of UC and GUC
to illustrate them.

4.3.1 Poissonian heuristics for DPLL

Let us consider a DPLL procedure without back-tracking acting on some pure $k$-XORSAT formula. For
the class of heuristics I want to introduce, it is convenient to modify the description of the procedure I
gave in Section 2.4 in such a way that unit propagations are performed by the heuristic. The modified
procedure is described by the following pseudocode:

procedure Modified DPLL($\{C_2(0), \dots, C_k(0)\}$)
    repeat
        Select and assign a variable $x$ according to Heuristic
        Simplify the formula
    until a contradiction is generated or all the variables are assigned
end procedure

with the heuristic:

procedure Poissonian Heuristic($\{p_j(C_1, \dots, C_k) \mid j = 0, \dots, k\}$)
    switch
        with probability $p_0(C_1, \dots, C_k)$:
            Select uniformly at random a variable $x$
            Assign $x$ to TRUE or FALSE uniformly at random
        otherwise (with probability $1 - p_0(C_1, \dots, C_k)$):
            Select at random a clause length $j \in \{1, \dots, k\}$ with probability $p_j(C_1, \dots, C_k)$
            Select uniformly at random a clause $C$ of length $j$
            Select uniformly at random a variable $x$ appearing in $C$
            Assign $x$ to TRUE or FALSE uniformly at random
    end switch
end procedure

where $p_j(C_1, \dots, C_k)$ with $j = 0, \dots, k$ are functions that characterize the heuristic. The Unit Propagation
rule then simply requires that $p_j(\{C_j\}) = \delta_{j,1}$ if $C_1 > 0$. Notice that $\{C_j\}$ are the extensive
numbers of clauses of length $j$ in the specific formula we are considering (they are not averaged over
the distribution of formulae). Moreover, since the alternatives corresponding to different values of
$j = 0, \dots, k$ are independent, it is possible to normalize the probabilities so that

\[ \sum_{j=0}^{k} p_j(C_1, \dots, C_k) = 1 . \tag{4.37} \]

It is easy to see that UC and GUC are special cases of this class of heuristics:

\[ p_j^{\rm UC}(\{C_j\}) = \begin{cases} \delta_{j,1} & \text{if } C_1 > 0 ; \\ \delta_{j,0} & \text{otherwise.} \end{cases} \tag{4.38} \]

\[ p_j^{\rm GUC}(\{C_j\}) = \begin{cases} \delta_{j,1} & \text{if } C_1 > 0 ; \\ \mathbb{I}[\, j \text{ is the length of the shortest clause in the formula} \,] & \text{otherwise.} \end{cases} \tag{4.39} \]
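The modified DPLL procedure with a poissonian heuristic can be sketched in a few lines of Python. The code below is an illustration under explicit assumptions and is not the implementation used in this work: clauses are represented as pairs (set of variables, parity), all function names are hypothetical, a variable taken from a unit clause is given the value forced by that clause (which is how I interpret Unit Propagation here), and when the formula contains no clauses the GUC rule falls back to picking a free variable.

```python
import random


def p_uc(counts):
    """UC rule (4.38); counts[j] is the number of clauses of length j, j = 0..k."""
    p = [0.0] * len(counts)
    p[1 if counts[1] > 0 else 0] = 1.0
    return p


def p_guc(counts):
    """GUC rule (4.39): always pick from the shortest clauses present."""
    p = [0.0] * len(counts)
    shortest = next((j for j in range(1, len(counts)) if counts[j] > 0), 0)
    p[shortest] = 1.0               # index 0 (free variable) if no clause is left
    return p


def modified_dpll(n_vars, clauses, p_rule, k, rng):
    """DPLL without back-tracking. Returns an assignment dict, or None on contradiction."""
    clauses = [(set(vs), par) for vs, par in clauses]
    assignment = {}
    while len(assignment) < n_vars:
        counts = [0] * (k + 1)
        for vs, _ in clauses:
            counts[len(vs)] += 1
        j = rng.choices(range(k + 1), weights=p_rule(counts))[0]
        if j == 0:                  # free choice among the unassigned variables
            x = rng.choice([v for v in range(n_vars) if v not in assignment])
            value = rng.randrange(2)
        else:                       # variable taken from a random clause of length j
            vs, par = rng.choice([c for c in clauses if len(c[0]) == j])
            x = rng.choice(sorted(vs))
            value = par if j == 1 else rng.randrange(2)
        assignment[x] = value
        simplified = []
        for vs, par in clauses:     # simplify the formula
            if x in vs:
                vs, par = vs - {x}, par ^ value
            if not vs:
                if par == 1:
                    return None     # empty clause with odd parity: contradiction
                continue            # empty clause with even parity: satisfied
            simplified.append((vs, par))
        clauses = simplified
    return assignment


formula = [({0, 1, 2}, 1), ({2, 3}, 0)]     # x0+x1+x2 = 1, x2+x3 = 0 (mod 2)
print(modified_dpll(4, formula, p_uc, 3, random.Random(0)))
```

On this small tree-like formula no contradiction can arise, so both rules always return a satisfying assignment; on random formulae the function returns `None` when a contradiction is generated.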

A very important property of this class of heuristics is that the sub-formulae that it generates are
uniformly distributed, conditioned on the numbers $\{C_j\}$ of clauses of length $j$. As a consequence, the
distribution of the number of occurrences of variables will remain poissonian under the action of these
heuristics, even though the parameter of the poissonian may vary. This is the reason why I call this
class of heuristics poissonian. In fact, I believe this to be the most general class of heuristics which
preserve the uniform distribution of the sub-formulae it generates (even though I am unable to support
this claim).

Because of this property of the heuristics it is possible to analyze them in terms of differential
equations, as we did for UC and GUC in Section 2.4. We define the time $t = T/N$ where $T$ is the
number of variables that have been assigned, and the average clause densities $c_j(t) = \mathbb{E}[C_j(Nt)/N]$.
The initial condition for the equations will be $c_j(0) = \alpha\, \delta_{j,k}$. Under the action of the heuristic, the
formula will trace a trajectory in the space $\{c_j\} \subset [0, 1]^{k-1}$. The dimension of the space is $k - 1$
instead of $k$ because if at any time $c_1(t) > 0$ the procedure generates a contradiction with probability
1 and it fails.

For notational convenience, I shall introduce $c_{k+1}(t) = 0$ and $p_{k+1}(\{C_j\}) = 0$. An analysis similar
to that carried out in Section 2.4 for GUC then shows that the differential equations that determine
$\{c_j\}$ are the following:

\[ \frac{dc_j}{dt} = \frac{(j+1)\, c_{j+1}(t) - j\, c_j(t)}{1 - t} - \rho_j(t) \qquad (j = 1, \dots, k) \tag{4.40} \]

where

\[ \rho_j(t) \equiv \lim_{\Delta T \to \infty} \lim_{N \to \infty} \frac{1}{\Delta T} \sum_{T = tN}^{tN + \Delta T - 1} \left[ p_j(\{C_{j'}(T)\}) - p_{j+1}(\{C_{j'}(T)\}) \right] \qquad (j = 1, \dots, k) \tag{4.41} \]

is (minus) the average variation of $C_j$ due to the algorithm selecting $j + 1$ or $j$ as the length for
the clause from which to pick the variable to be assigned. In this equation $\Delta T$ is a number of steps
of order $o(N)$, so that $c_j(t)$ can be considered constant over $\Delta T$, and which is a generalization of the
"round" I introduced in the analysis of GUC. Notice that $\rho_j(t)$ depends on $t$ only through $\{c_j(t)\}$.

The first term in (4.40) is due to the other clauses of the formula in which the selected variable
appears: on average, there will be $(j+1)\, c_{j+1}/(1-t)$ of them of length $j + 1$ (which will become of
length $j$) and $j\, c_j(t)/(1-t)$ of length $j$ (which will become of length $j - 1$).

Since the density of unit clauses in the formula is always 0, for $j = 1$ (4.40) reduces to

\[ \rho_1(t) = \frac{2\, c_2(t)}{1 - t} \tag{4.42} \]

which gives the explicit expression of $\rho_1(t)$ required to ensure Unit Propagation. The condition that
signals the appearance of contradictions with probability 1 is

\[ \rho_1(t) = \frac{2\, c_2(t)}{1 - t} = 1 . \tag{4.43} \]

I shall define one more $(k-2)$-dimensional surface in the phase diagram:

\[ \Sigma_q = \left\{ \tilde c \in [0, 1]^{k-1} : \tilde c_2 = \tfrac{1}{2} \right\} \tag{4.44} \]

where the 'q' stands for contradiction (the 'c' being very much in demand...) and where the tilde
reminds us that these clause densities are normalized to the number of variables in the sub-formula,
i.e. $\tilde c_j = c_j/(1 - t)$.

A final remark to conclude this paragraph: since the distribution of occurrences remains poissonian
at all times, the results of the previous section allow us to characterize the phase to which the sub-formulae
generated by the heuristics belong. The only difference is that the clause densities $c_j(t)$ are normalized
to the number of variables $N$ in the initial formula, so the definition of the potential must be modified as
follows:

\[ V(b, t, \alpha) \equiv -\frac{\gamma(b, t, \alpha)}{1 - t} + b + (1 - b) \log(1 - b) , \tag{4.45} \]

\[ \gamma(b, t, \alpha) \equiv \sum_{j=1}^{k} c_j(t)\, b^j \tag{4.46} \]

where the sum over $j$ can be extended to include 1 because $c_1(t) = 0$. $V$ depends on $t$ and $\alpha$ through
$\gamma$ and therefore through the $\{c_j(t)\}$ (which depend on $\alpha$ because of the initial condition). One should
be careful not to confuse the time $t$ which appears in these equations with that introduced in the
description of the leaf-removal of Section 4.1: $t$ is the fraction of variables appearing in the original
formula that have been assigned to obtain the sub-formula, to which the leaf-removal can then be
applied.

The prime in $\gamma'(b, t, \alpha)$ denotes the partial derivative with respect to $b$. In the
following I shall always denote derivatives with respect to $b$ with primes, and derivatives with respect
to $t$ with dots (e.g. $\dot\gamma(b, t, \alpha)$). Derivatives with respect to $\alpha$ will be written explicitly.

It is convenient to supplement (4.45) and (4.46) with the generating function of the $\{\rho_j(t)\}$:

\[ \phi(b, t, \alpha) \equiv \sum_{j=1}^{k} \rho_j(t)\, b^j \tag{4.47} \]

which will play an important role in the following.



4.3.2 General properties of poissonian heuristics 

The rate at which clauses are removed from the formula is given by

\[ -\sum_{j=1}^{k} \dot c_j(t) = -\dot\gamma(1, t, \alpha) = \sum_{j=1}^{k} \rho_j(t) \tag{4.48} \]

where the "telescopic" terms $(j+1)\, c_{j+1} - j\, c_j$ in $\dot c_j$ cancel each other. Since at each time step at most
one clause is removed from the formula, one must have

\[ -\dot\gamma(1, t, \alpha) \leq 1 . \tag{4.49} \]

This bound is saturated when $\rho_1(t) = 1$ which is the condition for the onset of contradictions.
Moreover, we can multiply (4.41) by $j$ and sum over $j$ to obtain

\[ \sum_{j=1}^{k} j\, \rho_j(t) = \phi'(1, t, \alpha) = \lim_{\Delta T \to \infty} \lim_{N \to \infty} \frac{1}{\Delta T} \sum_{T = Nt}^{Nt + \Delta T - 1} \sum_{j=1}^{k} p_j(\{C_{j'}(T)\}) \leq 1 \tag{4.50} \]

because of the normalization condition (4.37).

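More generally, if we denote the average over $\Delta T$ which appears in (4.41) and (4.50) with angled
brackets $\langle \cdot \rangle$, we have

\[ \rho_j(t) = \langle p_j \rangle - \langle p_{j+1} \rangle \tag{4.51} \]

where each $\langle p_j \rangle$ is non-negative and they are normalized so that $\sum_{j=0}^{k} \langle p_j \rangle = 1$ (because each term in
the sum defining the average over $\Delta T$ has these properties). Then we have

\[ \phi(b, t, \alpha) = \sum_{j=1}^{k} \rho_j(t)\, b^j = \sum_{j=1}^{k} \langle p_j \rangle\, b^j - \sum_{j=2}^{k} \langle p_j \rangle\, b^{j-1} \leq b \sum_{j=1}^{k} \langle p_j \rangle \leq b \tag{4.52} \]

since $b \in [0, 1]$. Moreover,

\[ \phi'(b, t, \alpha) = \sum_{j=1}^{k} j\, \rho_j(t)\, b^{j-1} = \langle p_1 \rangle + \sum_{j=2}^{k} b^{j-2}\, [1 - j(1 - b)]\, \langle p_j \rangle . \tag{4.53} \]

The coefficient in front of $\langle p_j \rangle$ in the terms of the sum is maximum for $b = 1$, independently of $j$, and
is then equal to 1, so that

\[ \phi'(b, t, \alpha) \leq \langle p_1 \rangle + \sum_{j=2}^{k} \langle p_j \rangle = 1 - \langle p_0 \rangle \leq 1 . \tag{4.54} \]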
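The two bounds $\phi \leq b$ and $\phi' \leq 1$ can be checked numerically by drawing random normalized averages $\langle p_j \rangle$. The following Python lines are a simple sanity check rather than a proof; the function names are hypothetical and the random sampling scheme is an arbitrary choice.

```python
import random


def phi_and_derivative(p_avg, b):
    """Return (phi(b), phi'(b)) for rho_j = <p_j> - <p_{j+1}>, with p_avg[j] = <p_j>, j = 0..k."""
    k = len(p_avg) - 1
    padded = list(p_avg) + [0.0]                      # <p_{k+1}> = 0
    rho = [padded[j] - padded[j + 1] for j in range(k + 1)]
    phi = sum(rho[j] * b ** j for j in range(1, k + 1))
    dphi = sum(j * rho[j] * b ** (j - 1) for j in range(1, k + 1))
    return phi, dphi


def check_bounds(trials=2000, seed=0):
    """True if phi <= b and phi' <= 1 hold for all sampled cases."""
    rng = random.Random(seed)
    for _ in range(trials):
        k = rng.randint(2, 8)
        weights = [rng.random() for _ in range(k + 1)]
        total = sum(weights)
        p_avg = [w / total for w in weights]          # non-negative, sums to 1
        b = rng.random()
        phi, dphi = phi_and_derivative(p_avg, b)
        if phi > b + 1e-12 or dphi > 1.0 + 1e-12:
            return False
    return True


print(check_bounds())
```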



These two bounds will be extremely useful in order to characterize the trajectories traced by
poissonian heuristics. To do that, for each value of $\alpha$ in the original formula, we can define the
three times $t_c(\alpha)$, $t_s(\alpha)$ and $t_q(\alpha)$ at which the reduced sub-formulae cross respectively the clustering
transition surface $\Sigma_c$, the SAT/UNSAT transition surface $\Sigma_s$ and the contradiction surface $\Sigma_q$ defined
above. A priori we could expect the trajectories to cross each surface more than
once, and in this case we shall consider the times of first crossing. By doing this, we ensure that the
three functions $t_x(\alpha)$ (where 'x' is 'c', 's' or 'q') are invertible, and we can define $\alpha_x(t)$ as the value of
$\alpha$ such that $t_x(\alpha) = t$. On the other hand, it is possible that the trajectory never crosses some (or all) of
these surfaces, in which case the corresponding $t_x(\alpha)$ will be undetermined.

Since the phase transitions are completely characterized by the potential $V$, these crossing times
will be determined by the conditions (4.32) and (4.33) on $V(b, t, \alpha)$: for given $\alpha$ and $t$ (and therefore
for given $\{c_j\}$) we define $b^*$ by (4.31) as the largest solution of the equation $V'(b, t, \alpha) = 0$; then
the clustering time $t_c(\alpha)$ will be such that $V''(b^*, t_c, \alpha) = 0$ and the SAT/UNSAT time $t_s(\alpha)$ will be
such that $V(b^*, t_s, \alpha) = 0$. As for the contradiction time $t_q(\alpha)$, it is determined by the condition

\[ \frac{2\, c_2(t_q)}{1 - t_q} = 1 . \]

Let us take the total time derivative of the condition that determines $b^*$, i.e. $V'(b^*, t, \alpha) = 0$:

\[ V''(b^*, t, \alpha)\, \frac{db^*}{dt} + \dot V'(b^*, t, \alpha) + \partial_\alpha V'(b^*, t, \alpha)\, \frac{d\alpha}{dt} = 0 . \tag{4.55} \]

The term in $db^*/dt$ is present because when $t$ changes, so do the values of $c_j(t)$ and therefore the
coefficients in the power series that defines $V$, and the point where its derivative vanishes moves. In
the same manner, if $b^*$ is held fixed, then as $t$ varies the only remaining parameter must vary as well,
and this is $\alpha$, which gives rise to the term in $d\alpha/dt$.

At the clustering transition, $\alpha = \alpha_c(t)$ and $b^* = b_c^*$, the condition $V''(b_c^*, t, \alpha_c(t)) = 0$ is verified,
so that the previous equation becomes

\[ \frac{d\alpha_c(t)}{dt} = -\frac{\dot V'(b_c^*, t, \alpha_c)}{\partial_\alpha V'(b_c^*, t, \alpha_c)} \tag{4.56} \]

where, let me stress it again, $\alpha_c = \alpha_c(t)$ is the value of $\alpha$ such that the trajectory crosses $\Sigma_c$ at time
$t$, and where the dot denotes a partial time derivative.
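From the definition (4.45) we have:

\[ \dot V'(b, t, \alpha) = -\frac{1}{1 - t} \left[ \dot\gamma'(b, t, \alpha) + \frac{\gamma'(b, t, \alpha)}{1 - t} \right] , \tag{4.57} \]

\[ \partial_\alpha V'(b, t, \alpha) = -\frac{1}{1 - t}\, \partial_\alpha \gamma'(b, t, \alpha) . \tag{4.58} \]

We can substitute these two expressions into (4.56) to obtain:

\[ \frac{d\alpha_c(t)}{dt} = -\left. \frac{\dot\gamma'(b, t, \alpha) + \gamma'(b, t, \alpha)/(1 - t)}{\partial_\alpha \gamma'(b, t, \alpha)} \right|_{b = b_c^*(t),\, \alpha = \alpha_c(t)} . \tag{4.59} \]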




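From the equations of motion of the heuristic (4.40) we obtain the following equation for $\gamma$:

\[ \dot\gamma(b, t, \alpha) = \frac{1 - b}{1 - t}\, \gamma'(b, t, \alpha) - \phi(b, t, \alpha) . \tag{4.60} \]

Differentiating it with respect to $b$ we have:

\[ \dot\gamma'(b, t, \alpha) = -\frac{1}{1 - t}\, \gamma'(b, t, \alpha) + \frac{1 - b}{1 - t}\, \gamma''(b, t, \alpha) - \phi'(b, t, \alpha) . \tag{4.61} \]

For $b = b_c^*(t)$ we shall have $V' = V'' = 0$, and since

\[ V''(b, t, \alpha) = -\frac{\gamma''(b, t, \alpha)}{1 - t} + \frac{1}{1 - b} \tag{4.62} \]

we get

\[ \left[ \frac{1 - b}{1 - t}\, \gamma''(b, t, \alpha) \right]_{b = b_c^*(t),\, \alpha = \alpha_c(t)} = 1 \tag{4.63} \]

so that the numerator in (4.59) becomes $1 - \phi'(b_c^*, t, \alpha_c)$ and we obtain:

\[ \frac{d\alpha_c(t)}{dt} = -\left. \frac{1 - \phi'(b, t, \alpha)}{\partial_\alpha \gamma'(b, t, \alpha)} \right|_{b = b_c^*(t),\, \alpha = \alpha_c(t)} . \tag{4.64} \]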

This is where the bounds (4.52) and (4.54) are important (actually, it is only the second of the
two which is used here): since $\phi'(b, t, \alpha) \leq 1$ for any $b, t, \alpha$, the numerator is surely positive or null.
Moreover, the denominator is positive at $t = 0$, when $\gamma'(b, 0, \alpha) = \alpha k b^{k-1}$ independently of the
heuristic. We then have two cases:

Case 1 The denominator remains positive at all times, in which case $d\alpha_c(t)/dt$ is always negative
and $\alpha_c(t)$ is a decreasing function of $t$, which implies that $t_c(\alpha)$ is a decreasing function of $\alpha$;

Case 2 If $\partial_\alpha \gamma'(b, t, \alpha)$ vanishes for some value of $t$ (for a given $\alpha$), the denominator in (4.64) vanishes.
Then $\partial_\alpha t_c(\alpha) = 0$ and $t_c(\alpha)$ has either an extremum or an inflection point. After that, the curve
will continue (with decreasing values of $\alpha$). The curve of $t_c(\alpha)$ cannot reach the axis $\alpha = 0$
(because for $\alpha = 0$ the formula is surely unclustered, and there is no $t_c$), and neither can it reach
the $t = 0$ axis (because at $t = 0$ we have a pure $k$-XORSAT formula, and we know that it has a
unique clustering transition), so it will end at some terminal point.

In both cases, $t_c(\alpha)$ is a single valued function of $\alpha$. It is the numerator of (4.64), not the denominator,
which should change sign in order for $t_c(\alpha)$ to take multiple values. But this cannot happen because
of (4.54). An illustration of the possible shapes of the curves for $t_x(\alpha)$ is given in Figure 4.3.

Notice that, even though we considered initially the possibility that the trajectory cross several
times $\Sigma_c$, and defined $t_c$ as the time of the first crossing, the argument I have just presented shows that there
can be at most one crossing. We shall see that this fact has profound implications for the performance
of poissonian heuristics. Before doing that, however, let me derive an analogous argument for $t_s(\alpha)$.

We start by taking the total time derivative of the potential,

\[ \frac{d}{dt} V(b, t, \alpha) = V'(b, t, \alpha)\, \frac{db}{dt} + \dot V(b, t, \alpha) + \partial_\alpha V(b, t, \alpha)\, \frac{d\alpha}{dt} . \tag{4.65} \]

Figure 4.3: Possible shapes for the curves $t_x(\alpha)$ ('x' being 'c' or 's'). $t_x$ is a strictly decreasing
function of $\alpha$ if the denominator in (4.64) or (4.72) never vanishes (middle full curve). If instead it
does vanish and then changes sign, $t_x$ will develop a maximum and then continue to the left with
positive derivative, but it will remain a single-valued function of $\alpha$ (bottom full curve). What cannot
occur (top dashed curve) is that $t_x'(\alpha)$ diverges and then changes sign, making $t_x$ a multiple-valued
function of $\alpha$: this would require the numerator in (4.64) or (4.72) to become negative, which cannot
occur because of the bounds (4.52) and (4.54). I shall prove later that the curve
representing $t_x(\alpha)$ must actually end at a point where its derivative is infinite, as in the case of the middle full
curve.



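At the SAT/UNSAT transition, $b = b_s^*(t)$, $\alpha = \alpha_s(t)$ and $V' = V = 0$ from (4.31) and (4.33), so that we
obtain:

\[ 0 = \left[ \dot V(b, t, \alpha) + \partial_\alpha V(b, t, \alpha)\, \frac{d\alpha}{dt} \right]_{b = b_s^*(t),\, \alpha = \alpha_s(t)} \tag{4.66} \]

from which

\[ \frac{d\alpha_s}{dt} = -\left. \frac{\dot V(b, t, \alpha)}{\partial_\alpha V(b, t, \alpha)} \right|_{b = b_s^*(t),\, \alpha = \alpha_s(t)} . \tag{4.67} \]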

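We can now substitute (4.60) in the partial time derivative of the potential (4.45) to obtain:

\[ \dot V(b, t, \alpha) = -\frac{1}{1 - t} \left[ \dot\gamma(b, t, \alpha) + \frac{\gamma(b, t, \alpha)}{1 - t} \right] = -\frac{1}{1 - t} \left[ \frac{1 - b}{1 - t}\, \gamma'(b, t, \alpha) - \phi(b, t, \alpha) + \frac{\gamma(b, t, \alpha)}{1 - t} \right] . \tag{4.68} \]

At the SAT/UNSAT transition we have

\[ V(b_s^*, t, \alpha_s) = 0 \quad \Rightarrow \quad \frac{\gamma(b_s^*, t, \alpha_s)}{1 - t} = b_s^* + (1 - b_s^*) \log(1 - b_s^*) , \tag{4.69} \]

\[ V'(b_s^*, t, \alpha_s) = 0 \quad \Rightarrow \quad \frac{\gamma'(b_s^*, t, \alpha_s)}{1 - t} = -\log(1 - b_s^*) , \tag{4.70} \]

so that (4.68) reduces to

\[ \dot V(b_s^*, t, \alpha_s) = -\frac{1}{1 - t} \left[ b_s^* - \phi(b_s^*, t, \alpha_s) \right] . \tag{4.71} \]

By substituting this in the numerator of (4.67) we obtain:

\[ \frac{d\alpha_s(t)}{dt} = -\left. \frac{b - \phi(b, t, \alpha)}{\partial_\alpha \gamma(b, t, \alpha)} \right|_{b = b_s^*(t),\, \alpha = \alpha_s(t)} . \tag{4.72} \]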




The argument now goes as for $\alpha_c(t)$: the bound (4.52) ensures that the numerator is non-negative,
and the denominator is positive at $t = 0$, so that $t_s(\alpha)$ must be single valued.

To summarize, in this paragraph I have shown that the trajectories described by poissonian heuristics
can cross the clustering transition surface $\Sigma_c$ and the SAT/UNSAT transition surface $\Sigma_s$ only once.
Moreover, it is clear that if they reach the contradiction surface $\Sigma_q$ the algorithm stops, and the
crossing of $\Sigma_q$ must also be unique.

4.3.3 Analysis of UC and GUC 

In this paragraph I shall give some examples of the results of the previous paragraph based on two 
poissonian heuristics that are particularly simple to analyze: UC and GUC. 

Analysis of UC 

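The equations of motion for UC are obtained from (4.38) and (4.40):

\[ \frac{dc_j}{dt} = \frac{(j+1)\, c_{j+1} - j\, c_j}{1 - t} \qquad (j \geq 2) \tag{4.73} \]

with the initial condition $c_j(0) = \alpha\, \delta_{j,k}$. The solution is straightforward:

\[ c_j^{\rm UC}(t) = \alpha \binom{k}{j} (1 - t)^j\, t^{k-j} \qquad (j \geq 2) . \tag{4.74} \]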
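The binomial form of the UC solution can be checked by integrating the equations of motion $dc_j/dt = [(j+1) c_{j+1} - j c_j]/(1-t)$ for $j \geq 2$, with $c_{k+1} = 0$ and $c_j(0) = \alpha \delta_{j,k}$, numerically. The sketch below uses a classical fourth-order Runge-Kutta scheme, which is simply a convenient choice for the illustration; the function names are hypothetical.

```python
from math import comb


def uc_rhs(t, c):
    """Right-hand side of the UC equations; c[i] holds c_{i+2}(t), i.e. lengths j = 2..k."""
    k = len(c) + 1
    rhs = []
    for j in range(2, k + 1):
        c_next = c[j - 1] if j < k else 0.0       # c_{j+1}, with c_{k+1} = 0
        rhs.append(((j + 1) * c_next - j * c[j - 2]) / (1.0 - t))
    return rhs


def integrate_uc(alpha, k, t_end, steps=2000):
    """Runge-Kutta integration from t = 0 to t_end < 1; returns [c_2, ..., c_k]."""
    c = [0.0] * (k - 2) + [alpha]                 # c_j(0) = alpha * delta_{j,k}
    h, t = t_end / steps, 0.0
    for _ in range(steps):
        k1 = uc_rhs(t, c)
        k2 = uc_rhs(t + h / 2, [x + h / 2 * d for x, d in zip(c, k1)])
        k3 = uc_rhs(t + h / 2, [x + h / 2 * d for x, d in zip(c, k2)])
        k4 = uc_rhs(t + h, [x + h * d for x, d in zip(c, k3)])
        c = [x + h / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
             for x, d1, d2, d3, d4 in zip(c, k1, k2, k3, k4)]
        t += h
    return c


def closed_form_uc(alpha, k, t):
    """Binomial solution alpha * C(k, j) * (1 - t)^j * t^(k - j) for j = 2..k."""
    return [alpha * comb(k, j) * (1 - t) ** j * t ** (k - j) for j in range(2, k + 1)]


numeric = integrate_uc(0.8, 4, 0.5)
exact = closed_form_uc(0.8, 4, 0.5)
print(max(abs(a - b) for a, b in zip(numeric, exact)))
```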



As usual $\rho_1 = 2 c_2^{\rm UC}/(1 - t)$ and for all $j > 1$ the corresponding $\rho_j = 0$. This is a direct consequence of
(4.38): the requirement for Unit Propagation is that the above expression of $\rho_1$ be true, and if there
are no unit clauses $p_0 = 1$ and all the other $p_j$'s are 0.
We can explicitly compute $\gamma$, $V$ and $\phi$:

\[ \gamma^{\rm UC}(b, t, \alpha) = \sum_{j=2}^{k} c_j^{\rm UC}(t)\, b^j = \alpha\, [t + b(1 - t)]^k - \alpha k (1 - t)\, t^{k-1}\, b - \alpha\, t^k , \tag{4.75} \]

\[ V^{\rm UC}(b, t, \alpha) = \alpha k\, t^{k-1}\, b - \alpha\, \frac{[t + b(1 - t)]^k - t^k}{1 - t} + b + (1 - b) \log(1 - b) , \tag{4.76} \]

\[ \phi^{\rm UC}(b, t, \alpha) = \sum_{j=1}^{k} \rho_j(t)\, b^j = \rho_1(t)\, b = \frac{2\, c_2^{\rm UC}(t)}{1 - t}\, b = \alpha k (k - 1)(1 - t)\, t^{k-2}\, b . \tag{4.77} \]

An example of the potential $V^{\rm UC}(b, t, \alpha)$ for $k = 3$ is plotted as a function of $b$ for different values of
$t$ and for fixed $\alpha = 0.8$ in Figure 4.1.

The times at which the trajectories cross $\Sigma_c$ and $\Sigma_s$ are obtained by solving (numerically) for $b$
and $t$ with fixed $\alpha$ the equations $\{ (V^{{\rm UC}\,\prime} = 0) \wedge (V^{{\rm UC}\,\prime\prime} = 0) \}$ and $\{ (V^{{\rm UC}\,\prime} = 0) \wedge (V^{\rm UC} = 0) \}$
(respectively). The bounds (4.52) and (4.54) obviously hold, since $\phi^{\rm UC}$ is simply $\rho_1 b$ and $\rho_1 \leq 1$,
with the equal sign on the contradiction surface $\Sigma_q$. Moreover, from (4.75) the denominator in (4.64) is $\partial_\alpha \gamma' =
k (1 - t) \{ [t + b(1 - t)]^{k-1} - t^{k-1} \} > 0$, and the denominator in (4.72) is $\partial_\alpha \gamma$ which is also strictly positive. This
ensures that $t_c(\alpha)$ and $t_s(\alpha)$ are strictly decreasing functions of $\alpha$. The time at which contradictions
are generated with probability 1 is obtained by solving $2 c_2^{\rm UC}(t)/(1 - t) = 1$ for $t$ at fixed $\alpha$. The plots
of $t_c(\alpha)$, $t_s(\alpha)$ and $t_q(\alpha)$ are shown in Figure 4.4.
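The crossing times for UC can be obtained with a few lines of Python. The sketch below treats $k = 3$ and $\alpha = 0.8$, the case of Figure 4.1, and should give values close to $t_c = 0.02957$ and $t_s = 0.11697$ quoted in its caption. It assumes the UC densities $c_2 = 3\alpha (1-t)^2 t$ and $c_3 = \alpha (1-t)^3$, divided by $1 - t$ to normalize them to the sub-formula; the function names, the grid scan for $b^*$ and the bisection on $t$ are choices made for the illustration only.

```python
import math

ALPHA = 0.8                                   # initial clause density (k = 3)


def reduced_densities(alpha, t):
    """UC densities for k = 3 divided by (1 - t): returns (c2~, c3~)."""
    return 3.0 * alpha * t * (1.0 - t), alpha * (1.0 - t) ** 2


def backbone(alpha, t, grid=4000):
    """Largest b in (0, 1] with V'(b, t, alpha) = 0, or 0.0 if there is none."""
    c2, c3 = reduced_densities(alpha, t)
    g = lambda b: 1.0 - math.exp(-(2.0 * c2 * b + 3.0 * c3 * b * b)) - b
    hi = 1.0
    for i in range(grid - 1, 0, -1):
        lo = i / grid
        if g(lo) >= 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if g(mid) >= 0.0:
                    lo = mid
                else:
                    hi = mid
            return lo
        hi = lo
    return 0.0


def is_unsat(alpha, t):
    """True if the sub-formula has a back-bone and V(b*, t, alpha) < 0."""
    b = backbone(alpha, t)
    if b == 0.0:
        return False
    c2, c3 = reduced_densities(alpha, t)
    return -(c2 * b * b + c3 * b ** 3) + b + (1.0 - b) * math.log(1.0 - b) < 0.0


def first_time(predicate, lo, hi, iters=40):
    """Bisection on t, assuming predicate is False at lo and True at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if predicate(mid):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


t_c = first_time(lambda t: backbone(ALPHA, t) > 0.0, 0.0, 0.2)
t_s = first_time(lambda t: is_unsat(ALPHA, t), t_c, 0.2)
t_q = 0.5 * (1.0 - math.sqrt(1.0 - 2.0 / (3.0 * ALPHA)))   # smallest root of 6 alpha t (1 - t) = 1
print(t_c, t_s, t_q)
```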

Figure 4.4: Times of crossing of Σ_c, Σ_s and Σ_q for k = 3 for UC and GUC. For α = α_c ≃ 0.818 the
initial formula is at the clustering transition and t_c = 0 for both heuristics. The same happens with
the SAT/UNSAT transition at α = α_s ≃ 0.918. As expected, t_c(α) and t_s(α) are single-valued. The fact
that they are strictly decreasing means that for UC and GUC the denominators of (4.64) and (4.72)
never change sign.

The largest value of α for which the algorithm finds a solution with finite probability (which I
shall denote α_h^{UC}(k), the 'h' standing for 'heuristic') is the smallest value of α for which the trajectory
crosses the SAT/UNSAT transition surface Σ_s. Alternatively, it can be computed as the smallest value
of α for which the equation 2c_2^{UC}(t)/(1 - t) = 1 has a solution, which was done in Section [TH

For k = 3 this is equal to 2/3 and for large k it goes as e/k + O(k^{-2}).
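
The two values just quoted can be checked with a few lines of Python. This is only a sketch: it assumes that the UC unit-propagation rate is 2c_2(t)/(1 - t) = α k (k - 1)(1 - t) t^{k-2}, and the function names `p1_uc` and `alpha_h_uc` are introduced here for illustration.

```python
import math

def p1_uc(alpha: float, k: int, t: float) -> float:
    """Assumed UC rate of unit propagations: 2*c_2(t)/(1-t) = alpha*k*(k-1)*(1-t)*t**(k-2)."""
    return alpha * k * (k - 1) * (1.0 - t) * t ** (k - 2)

def alpha_h_uc(k: int) -> float:
    """Smallest alpha for which max_t p1_uc(alpha, k, t) reaches 1."""
    t_star = (k - 2) / (k - 1)  # maximiser of (1-t)*t**(k-2) on [0, 1]
    return 1.0 / p1_uc(1.0, k, t_star)

if __name__ == "__main__":
    for k in (3, 4, 10, 100, 1000):
        # second and third columns should approach each other as k grows
        print(k, alpha_h_uc(k), math.e / k)
```

For k = 3 the maximum is at t = 1/2 and the threshold is 2/3; for large k the product k α_h^{UC}(k) approaches e.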



Analysis of GUC 

The analysis of GUC is slightly more complicated. The analysis of Section [2T4l shows that the equations
of motion are

\[ \frac{dc_j}{dt} = \frac{(j+1)\, c_{j+1} - j\, c_j}{1-t} - \delta_{j,j^*(t)} \left[ \frac{1}{j} - \frac{(j-1)\, c_j}{1-t} \right] \qquad (j \ge j^*(t)) \tag{4.79} \]

where j*(t) is the smallest value of j such that c_j(t) > 0, assuming the initial condition c_j(0) = α δ_{j,k}.
The interpretation of these equations is that GUC always assigns a variable appearing in the shortest
clause (or possibly clauses) in the formula. As long as j* c_{j*}/(1 - t) < 1/(j* - 1) the rate at which
clauses of length j* - 1 are generated is small enough that they can be removed, and the density of
clauses of length j* - 1 remains 0; when this bound is violated, an extensive number of clauses of length
j* - 1 accumulates, and c_{j*-1} becomes positive. I shall call t*(j) the time at which c_j(t) becomes
positive. When this happens, the value of j* is decreased by 1. The equations (4.79) therefore hold
for j ≥ j*(t), while c_j(t) = 0 for all j < j*(t).

Even though it is in principle possible to solve (4.79) exactly for any finite k, the solution becomes
more and more complicated as k increases, since it involves matching the solutions of different
differential equations at k - 2 points (at least for α large enough that j* reaches 2). I shall only give the
example of k = 3, for which one obtains:

\[ c_3^{\rm GUC}(t) = \alpha (1-t)^3 , \tag{4.80} \]

\[ c_2^{\rm GUC}(t) = \frac{1}{2} (1-t) \left\{ 3\alpha \left[ 1 - (1-t)^2 \right] + \log(1-t) \right\} . \tag{4.81} \]

Notice that from (4.79) it is clear that the p_j(t) are all 0 except for two of them:

\[ p_{j^*}(t) = \frac{1}{j^*} - \frac{(j^*-1)\, c_{j^*}(t)}{1-t} , \tag{4.82} \]

\[ p_{j^*-1}(t) = \frac{j^* c_{j^*}(t)}{1-t} . \tag{4.83} \]

For a fixed value of j* = j, t varies between t*(j) and t*(j - 1), and during this interval of time,
j c_j(t)/(1 - t) varies between 0 and 1/(j - 1), so that we have:

\[ \frac{1}{j^*(t)} \le p_{j^*(t)}(t) + p_{j^*(t)-1}(t) \le \frac{1}{j^*(t)-1} . \tag{4.84} \]

It is easy to see that the bound expressed in (4.52) is respected (actually the previous inequality is
even more stringent) and that the bound in (4.50) is respected but saturated: Σ_j j p_j(t) = 1 at all
times.
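
As a sanity check of the k = 3 solution, the equations of motion can be integrated numerically and compared with the closed forms (4.80) and (4.81). The sketch below assumes that α > 1/6, so that j* = 2 from t = 0 onwards, and that 2-clauses are picked by the heuristic at rate p_2 = 1/2 - c_2/(1 - t); the function names are illustrative only.

```python
import math

def guc_rhs_k3(t, c2, c3):
    """Assumed GUC equations for k = 3 while j* = 2 (c_1 = 0, c_2 > 0)."""
    dc3 = -3.0 * c3 / (1.0 - t)
    p2 = 0.5 - c2 / (1.0 - t)                       # rate at which 2-clauses are selected
    dc2 = (3.0 * c3 - 2.0 * c2) / (1.0 - t) - p2
    return dc2, dc3

def integrate_guc_k3(alpha, t_end, steps=2000):
    """Classical fourth-order Runge-Kutta from c_3(0) = alpha, c_2(0) = 0."""
    h = t_end / steps
    t, c2, c3 = 0.0, 0.0, alpha
    for _ in range(steps):
        k1 = guc_rhs_k3(t, c2, c3)
        k2 = guc_rhs_k3(t + h / 2, c2 + h / 2 * k1[0], c3 + h / 2 * k1[1])
        k3 = guc_rhs_k3(t + h / 2, c2 + h / 2 * k2[0], c3 + h / 2 * k2[1])
        k4 = guc_rhs_k3(t + h, c2 + h * k3[0], c3 + h * k3[1])
        c2 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c3 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return c2, c3

def c2_closed(alpha, t):
    """Closed form (4.81)."""
    return 0.5 * (1.0 - t) * (3.0 * alpha * (1.0 - (1.0 - t) ** 2) + math.log(1.0 - t))

def c3_closed(alpha, t):
    """Closed form (4.80)."""
    return alpha * (1.0 - t) ** 3
```

For instance, `integrate_guc_k3(0.7, 0.5)` can be compared with `c2_closed(0.7, 0.5)` and `c3_closed(0.7, 0.5)`.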

The crossing times of Σ_c, Σ_s and Σ_q are computed by solving numerically the equations obtained
from the conditions (4.31), (4.32) and (4.33), as for UC. The results are shown for k = 3 in Figure 4.4.

The largest value of α for which GUC succeeds with positive probability in finding a solution,
α_h^{GUC}(k), can be found by looking for the value of α for which max_{t∈[0,1]} 2c_2^{GUC}(t)/(1 - t) = 1. For k = 3
this gives the equation 6α - log(6α) = 3, so that α_h^{GUC}(3) ≃ 0.750874. Notice that this is larger than
α_h^{UC}(3), as could be expected.
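
The numerical value of the k = 3 threshold is easy to reproduce. The following sketch maximises 2c_2(t)/(1 - t) analytically, using the closed form of c_2 for GUC, and then finds the threshold by bisection; the helper names and the bracketing interval are choices made here, not part of the original computation.

```python
import math

def guc_max_rate_k3(alpha: float) -> float:
    """max over t in [0, 1) of 2*c_2(t)/(1-t) = 3*alpha*(1 - s^2) + log(s), with s = 1 - t."""
    s2 = min(1.0, 1.0 / (6.0 * alpha))   # s^2 at the stationary point, clipped so that t >= 0
    return 3.0 * alpha * (1.0 - s2) + 0.5 * math.log(s2)

def alpha_h_guc_k3(lo: float = 0.2, hi: float = 2.0, tol: float = 1e-12) -> float:
    """Bisection on alpha for guc_max_rate_k3(alpha) = 1."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if guc_max_rate_k3(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    a = alpha_h_guc_k3()
    print(a, 6.0 * a - math.log(6.0 * a))
```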

4.4 Bounds on the values of α for which poissonian heuristics
can succeed

I shall now discuss how the results of the previous Section on the general properties of poissonian
heuristics are related to the phase diagram of k-XORSAT, and in particular what consequences this
relation has on the performance of poissonian heuristics in the various phases.

At the end of Section 4.2 I have shown that the surfaces Σ_c and Σ_s intersect each other (I called
the intersection critical surface Σ_k) and that Σ_c, Σ_s and Σ_k are tangent to each other and to the
contradiction surface Σ_q. This is a property of the phase diagram of k-XORSAT which has nothing to do
with specific DPLL heuristics. However, a continuity argument based on the fact that the trajectories
generated by poissonian heuristics can cross the surfaces Σ_c and Σ_s at most once confirms it. The
argument goes as follows.

For any heuristic of the poissonian class, there is a threshold α_h(k) below which the heuristic finds
a solution with positive probability and above which this probability vanishes. The heuristic fails with
probability 1 if the (average) trajectory intersects the contradiction surface Σ_q. Since for α < α_h the
trajectory must not intersect Σ_q while for α > α_h it must, by continuity (of the trajectory and its
derivatives, and of Σ_q and its derivatives) this implies that the trajectory corresponding to α_h must
be tangent to Σ_q.

In the same manner, since the trajectories can cross Σ_s at most once, if a trajectory enters the
UNSAT phase, it cannot escape from it, and the algorithm must fail. This means that for α < α_h the
trajectories must not cross Σ_s, while for α > α_h they must. As before, by continuity this implies that
the trajectory corresponding to α_h must be tangent to Σ_s. The same argument can be made to show
that it is also tangent to Σ_c.

Finally, since Σ_c, Σ_s and Σ_q intersect on the critical surface Σ_k and the trajectory corresponding
to α_h must be tangent to all of them, without crossing any of them, this means that the trajectory
must be tangent to each of them on the critical surface Σ_k. Therefore, Σ_c, Σ_s and Σ_q are tangent to
each other on Σ_k.

Indeed, it is very simple to see that this argument is correct. The point of a trajectory generated
by a poissonian heuristic which is closest to the contradiction surface Σ_q will verify the stationarity
condition

\[ \frac{d}{dt} \frac{2c_2(t)}{1-t} = \frac{2\dot{c}_2(t)}{1-t} + \frac{2c_2(t)}{(1-t)^2} = 0 \tag{4.85} \]

which, together with the equations of motion (4.40), gives

\[ \frac{dc_2(t)}{dt} = \frac{3c_3(t) - 2c_2(t)}{1-t} - p_2(t) = -\frac{c_2(t)}{1-t} . \tag{4.86} \]

The critical trajectory (i.e. the trajectory corresponding to α_h) will be such that the value of
2c_2(t)/(1 - t) at the maximum is 1. When this happens, p_1(t) = 1 so we must have p_2(t) = 0
(the heuristic only performs Unit Propagations), and we obtain

\[ 3c_3(t) = c_2(t) \tag{4.87} \]

which, together with 2c_2(t)/(1 - t) = 1, is the equation of the critical surface Σ_k given in (4.36). As
2c_2(t)/(1 - t) is maximum at the point of intersection, the trajectory must be tangent to it.
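
These conditions can be verified explicitly for k = 3. The sketch below evaluates the UC and GUC trajectories at their respective thresholds and at the time where 2c_2(t)/(1 - t) is maximal, and checks that both 2c_2/(1 - t) = 1 and 3c_3 = c_2 hold there. The closed forms used for c_2 and c_3 (binomial densities for UC, the k = 3 solution for GUC) are assumptions of this illustration.

```python
import math

def uc_critical_point_k3():
    """UC at k = 3: threshold alpha = 2/3, maximum of 2*c_2/(1-t) at t = 1/2."""
    alpha, t = 2.0 / 3.0, 0.5
    c2 = 3.0 * alpha * (1.0 - t) ** 2 * t          # alpha * C(3,2) * (1-t)^2 * t
    c3 = alpha * (1.0 - t) ** 3
    return t, c2, c3

def guc_critical_point_k3():
    """GUC at k = 3: solve x - log(x) = 3 for x = 6*alpha by Newton's method."""
    x = 4.5
    for _ in range(50):
        x -= (x - math.log(x) - 3.0) / (1.0 - 1.0 / x)
    alpha = x / 6.0
    s = math.sqrt(1.0 / x)                         # s = 1 - t at the maximum
    c2 = 0.5 * s * (3.0 * alpha * (1.0 - s * s) + math.log(s))
    c3 = alpha * s ** 3
    return 1.0 - s, c2, c3

if __name__ == "__main__":
    for name, point in (("UC", uc_critical_point_k3()), ("GUC", guc_critical_point_k3())):
        t, c2, c3 = point
        print(name, "2c2/(1-t) =", 2.0 * c2 / (1.0 - t), "3c3 - c2 =", 3.0 * c3 - c2)
```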

This has a direct implication for the shape of the curves representing t_c(α), t_s(α) and t_q(α): since
each of these curves ends at the value of (α, t) that corresponds to the point where the trajectory is
tangent to Σ_k, the three curves must end in the same point (which I shall call critical point) in the
(α, t) plane, and they must be tangent to each other at the critical point. Since at the critical point
the trajectory is on the contradiction surface, so that p_1 = 2c_2/(1 - t) = 1, from (4.72) it is clear
that dt_s/dα diverges at the critical point, and since the three curves are tangent, they all have infinite
derivative. This is clearly seen in Figure 4.4 for UC and GUC with k = 3. The value of α at the
critical point is the largest value for which the heuristic succeeds with positive probability, i.e. α_h(k).

We can now derive the main result of this Chapter, which follows in a straightforward manner
from the previous discussion. The curve representing t_c(α) starts at the point (α_c(k), 0) and ends at
the point (α_h(k), t_k). Moreover, t_c(α) is a single-valued function of α, and its derivative is negative
at α = α_c(k). This implies that

\[ \alpha_h(k) < \alpha_c(k) \tag{4.88} \]

i.e. that poissonian heuristics fail with probability 1 in the clustered phase.

This result is, as far as I know, the first that relates the performance of a class of heuristics for
DPLL with the properties of the phase diagram of the optimization problem.



4.5 Optimality of GUC for large k 

The result of the previous Section states that no poissonian heuristic for DPLL can succeed with
positive probability in the clustered phase, i.e. for α > α_c(k). It is then natural to ask what is the
maximum value of α which can actually be attained, and which heuristic reaches it, that is to say,
what is the optimal heuristic.

It is clear that the optimal heuristic will be the one which minimizes



Ackh = etc ^ cth ~ / dt' — ^ = — / dt' 







dt' Jq da^'{b,t,a) 



(4.89) 

b=b^(t'),a=aa{t') 



where I used (4.72) and where t_k is the time coordinate of the critical point in the (α, t) plane, which
will depend on the heuristic. Finding the optimal heuristic is a very difficult task: on one hand, the
functions φ′(b, t, α), γ′(b, t, α), b*(t) and α_c(t) have a highly non-trivial dependence on the parameters
which characterize the heuristic, i.e. the probability functions {p_j(C_1, ..., C_k)}; on the other hand,
the quantity which must be minimized is an integral, which requires a functional optimization.

I shall therefore discuss two more accessible results: first, that for finite k GUC locally minimizes the
numerator of (4.89); and second, that in the limit k → ∞ GUC indeed is optimal, i.e. α_h(k) → α_c(k).

The first statement needs clarification: by "locally minimizes", I mean that at each point of the
trajectory described by GUC, it minimizes the numerator in (4.89). This is a much weaker requirement
than optimality, because a different trajectory, which is sub-optimal at some points, might turn out
to be much better at some other points, and overall be better than GUC; and, of course, because
the denominator should be considered as well. However, I think this result is interesting because it
sheds some light on why it is impossible for poissonian heuristics to penetrate the clustered phase.

Indeed, from the definition of φ, which gives

\[ \phi'(b,t,\alpha) = \sum_{j=1}^{k} j\, p_j(t)\, b^{j-1} , \tag{4.90} \]

and from the bound

\[ \sum_{j=1}^{k} j\, p_j(t) = \phi'(1,t,\alpha) \le 1 \tag{4.91} \]

it is clear that φ′ will be maximized (and hence 1 - φ′ will be minimized) by taking "the largest possible
p_j for the smallest possible j". This means that a heuristic which tries to minimize the numerator
in the integrand that gives Δα should always select the variables to assign in the shortest available
clauses, and this is exactly what GUC does.

Moreover, I already noted at the end of Section 4.3 that GUC saturates the bound (4.91). This
implies that GUC achieves the largest possible value of Σ_j p_j, which is the rate at which clauses are
eliminated from the formula. Since this only happens through Unit Propagations, it also means that
GUC achieves the highest possible rate of Unit Propagations per variable assigned, and therefore
minimizes the fraction of variables that are assigned random values. I think this argument makes it
at least plausible that GUC is actually the best poissonian heuristic.

A much stronger argument can be made to support the claim that GUC indeed is optimal in the
limit k → ∞. From (4.48) and (4.84) we have, integrating dt:

This integral over dt is equal to a sum over the values of j between k and j*{t), 

' t*U)-t*{j + i) I, , ^ t*{j)-t*{j + i) 

a- > : <-l{l-,t,a)<a- > (4.93 

^-^ 7 — 1 1 
since j*(t) is a step-like function, which is a constant j*(t) = j for t*(j) < t < t*(j - 1).
It is reasonable to assume that, in the large k limit,

\[ t^*(j) - t^*(j+1) = \frac{1}{k} + o(k^{-1}) \tag{4.94} \]

for most values of j, i.e. for j such that 0 < j/k < 1. This assumption is well supported by numerical
data for k in the range 2^ to 2^^, as we shall see later. 
Under this assumption, we obtain

\[ \gamma(1,t,\alpha) = \alpha - \frac{1}{k} \sum_{j=j^*(t)}^{k} \frac{1}{j} . \tag{4.95} \]

In order for the algorithm to generate a contradiction with probability 1, we must have 2c_2/(1 - t) ≥ 1,
and to have c_2 > 0, j*(t) must reach 2. So if j* always remains larger than 2, the algorithm must have
a finite probability to succeed. If it does indeed succeed, it stops when γ(1, t, α) = 0, since γ(1, t, α) is
the number of clauses in the formula at time t. The smallest value of α for which the algorithm fails
with probability 1 is therefore such that

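\[ 0 = \alpha_h^{\rm GUC}(k) - \frac{1}{k} \sum_{j=2}^{k} \frac{1}{j} = \alpha_h^{\rm GUC}(k) - \frac{\log k + O(1)}{k} \qquad (\text{for } k \to \infty) \tag{4.96} \]

where the term O(1) in the numerator comes from the fact that it is possible that for a number of
terms of order o(k) the asymptotic expansion (4.94) doesn't hold. We obtain:

\[ \alpha_h^{\rm GUC}(k) = \frac{\log k}{k} + O(k^{-1}) \qquad (\text{for } k \to \infty) . \tag{4.97} \]

This is the same scaling that is found for α_c(k) (see Table ISTTj), so that to the leading order in k

\[ \alpha_h^{\rm GUC}(k) \simeq \alpha_c(k) \qquad (\text{for } k \to \infty) . \tag{4.98} \]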
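
The logarithmic leading term comes from a harmonic-type sum. As a small illustration, the sketch below assumes that the sum entering the estimate is Σ_{j=2}^{k} 1/j (the exact summand is an assumption made here) and checks that it equals log k plus a bounded constant; `harmonic_tail` and `EULER_GAMMA` are names introduced for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def harmonic_tail(k: int) -> float:
    """Sum_{j=2}^{k} 1/j, assumed here to be the sum behind the log(k)/k scaling."""
    return sum(1.0 / j for j in range(2, k + 1))

def leading_estimate(k: int) -> float:
    """(1/k) * Sum_{j=2}^{k} 1/j, to be compared with log(k)/k."""
    return harmonic_tail(k) / k

if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        k = 2 ** n
        print(k, leading_estimate(k), math.log(k) / k)
```

The difference between the two printed columns is of order 1/k, consistent with a correction O(k^{-1}) to the leading term; the constant obtained this way is not expected to match the fitted coefficient of the full numerical solution.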

Let us now turn to the assumption (4.94). In order to verify it, we have performed a series of
numerical simulations, in which the equations of motion of GUC are integrated by finite differences, 
for values of k equal to the powers of two between 2^ and 2^^. A finite-size scaling (with respect to 
k) of the results, shown in Figure 4.5, is consistent with the scaling

\[ k \left[ t^*(j) - t^*(j+1) \right] = 1 + k^{-\mu} \times f(j/k) \tag{4.99} \]

where f(x) is a function independent of k and which goes as x^{-ν} for x → 0. The values of μ and ν
are found to be both equal to 1/2. Integrating the scaling form (4.99) with μ = ν = 1/2 one obtains
that the first correction to the leading term log k/k in α_h^{GUC}(k) is of order 1/k, in agreement with the
numerical estimates of α_h^{GUC}(k) which give α_h^{GUC}(k) ≃ log k/k + 2.15/k.

I believe that the above numerical results make a strong case supporting the assumption (4.94),
and therefore the optimality of GUC.



4.6 Conclusions and perspectives 

In this Chapter, I have discussed some very general bounds on the performance of poissonian heuristics
for DPLL for the solution of k-XORSAT formulae and for that of its NP-complete extensions, called
Figure 4.5: Finite size scaling results for GUC at large k. Top left: each curve shows the values
of k[t*{j) — t*{j + 1)] as a function of j/k for k = 2^, 2^, ... , 2^^ (from the farthest to the closest 
curve to 1), and was obtained by integrating the equations of motion (4.79) by finite differences.
For each k, the value of α used is α_h^{GUC}(k), determined as the value of α for which the maximum
reached by 2c_2(t)/(1 - t) is 1. Top right: data points of α_h^{GUC}(k) versus log k/k + 2.15/k (red line).
Bottom left The same data as above, plotted as {k x [t*{j) -~t*{j + 1)]} x fc-^/^. The curves "collapse", 
showing f(x) and confirming the value of μ = 1/2. Bottom right: by plotting the same curves on
logarithmic scale it is easily seen that for x close to 0, f(x) ≃ x^{-ν} with ν = 1/2, corresponding to the
slope of the red line.
(d, k)-UE-CSP. In particular, I have proved that such heuristics generate contradictions (i.e. fail) with
probability 1 in the clustered phase of the problem.

A word of caution is in order in the interpretation of this result: it is a very peculiar feature
of k-XORSAT that the clustering and freezing transitions coincide. What is found in general in other
problems is that the clustering transition, where solutions form an exponential number of connected
clusters that are well separated, and the freezing transition, where some variables take a constant
value in all the solutions of a given cluster, are distinct. It is well known that in problems where these
thresholds are distinct, it is the freezing transition that corresponds to the onset of hardness for known
local algorithms. It can be argued that in k-XORSAT too, what causes DPLL poissonian heuristics to
fail is the strong correlations between variables that are present in the frozen phase, rather than the
separation of the clusters. In view of this, it would be very interesting to understand what similar
bounds could be obtained in problems where the two thresholds are distinct, and notably in k-SAT.

Another interesting question concerns the extension to more general, non-poissonian heuristics. In
this regard, I have obtained some partial results that seem promising, even though a general theory is
still far off. More specifically, I have been able to solve the leaf removal equations for the case in which the
mixed system to which it is applied is not poissonian, but instead is characterized by some arbitrary
distribution of the number of occurrences. However, due to the complicated structure of the solution,
it has so far proved impossible to characterize the phase transitions in terms of a potential, which
would then allow one to derive some general properties of the trajectories, and possibly some bounds on
the values of α for which solutions can be found. Some further work in this direction seems worth
undertaking.



Chapter 5 



Characterization of the solutions of
k-SAT at large α

In this Chapter I shall discuss the properties of the solutions of random k-SAT at large α. This might
seem oxymoronic, since at large α random k-SAT formulae are UNSAT with probability 1. The idea is
precisely to restrict the formulae that are considered to those that, for a given large α, are SAT, then to
form an ensemble of these formulae with uniform weight, and study the properties of their solutions.

Apart from the intrinsic interest of the question, i.e. studying the properties of this particular
ensemble of k-SAT formulae, this problem is relevant because of some recent results by Feige and
collaborators [THITS]: for the first time (as far as I know), they have been able to relate the average case 
complexity of a satisfiability problem with the worst case complexity of another class of problems, thus 
bridging the gap between complexity theory and results derived from statistical mechanics methods. 

Feige's result can be summarized as follows: under the assumption that there is no polynomial-time
algorithm capable of recognizing every SAT instance (and most UNSAT instances) of 3-SAT for
arbitrarily large (but bounded in N) values of α, the approximation problem to several optimization
problems (including min bisection, dense k-subgraph and max bipartite clique) is hard, i.e.
non-polynomial in time in the worst case. The complexity class of the approximation problems considered
by Feige was previously not known.

With this motivation, Rémi Monasson, Francesco Zamponi and I have studied in [76] the problem
of characterizing the solutions of 3-SAT at large α, with the objective of showing that a simple
message-passing procedure is able to contradict a probabilistic version of Feige's assumption, in which "every"
is substituted with "with probability p", for any (finite) value of p.

In the following Sections, I shall therefore present in more detail Feige's result and define the
problem (Section 5.1); then I shall present the computation of the free energy of the uniform distribution
of satisfiable 3-SAT formulae, in Section 5.2; in Section 5.3 a similar result is derived from the cavity
formalism; then, in Section 5.4 I shall compare the results obtained with those that are valid for a
different ensemble of formulae, which was studied by Feige, and draw their algorithmic implications; I
shall then comment, in Section 5.5, on the stability of the RS solution of Sections 5.2 and 5.3; finally,
in Section 5.6 I shall present and discuss the conclusions of this work.
5.1 Problem definition and previously established results

I shall now define the problem I want to study, and give a brief overview of Feige's results, concerning on 
one hand the relation between the average-case complexity of 3-SAT and the worst-case complexity of a 
class of approximation problems, and on the other hand the properties of a very simple message-passing 
algorithm, which on a particular ensemble of satisfiable 3-SAT formulas has interesting properties (in 
view of the previous complexity result). 

5.1.1 Definition of the random ensembles 

Let us consider random 3-SAT formulae F involving N boolean variables {x_1, ..., x_N} and M = αN
clauses, with finite α (as N → ∞). I shall denote assignments of the N variables as X = {x_i | i =
1, ..., N} ∈ {TRUE, FALSE}^N. Alternatively, I shall represent them as configurations of N Ising spins
σ_i ∈ {-1, 1}, collectively denoted by σ = {σ_i | i = 1, ..., N}, with σ_i = 1 corresponding to x_i = TRUE
and -1 to FALSE.

The Uniform Ensemble P_unif[F] is obtained by giving the same weight to each possible formula F.
When α > α_s(3) ≃ 4.267, the probability over P_unif[F] that a formula is SAT is 0: the overwhelming
majority of formulae are UNSAT. It is therefore interesting to introduce two particular ensembles that
include only those formulae that are SAT:

Satisfiable Ensemble P_sat[F] is the ensemble of satisfiable formulae, with uniform weight. This is
simply the restriction of P_unif to satisfiable formulae.

Planted Ensemble Given an assignment X, the ensemble P_plant[F|X] of SAT formulae "planted on X"
is defined as the uniform ensemble of formulae that admit X as a solution. The Planted Ensemble
P_plant[F] is obtained by averaging over X with uniform weight for all possible configurations.
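
The difference between the two ensembles can be made concrete by brute force on a toy instance. The sketch below is only an illustration under assumptions chosen here: formulae are ordered lists of m clauses, each clause involves k distinct variables, and the sizes (3 variables, 2-literal clauses, 2 clauses) are far from the regime of interest. It counts, for each assignment X, the formulae satisfied by X, and, for each formula, its number of solutions, which is what the planted weight is proportional to.

```python
from itertools import combinations, product
from math import comb

def all_clauses(n_vars, k):
    """All clauses on k distinct variables; a clause is a tuple of (variable, negated) pairs."""
    return [tuple(zip(vs, negs))
            for vs in combinations(range(n_vars), k)
            for negs in product((False, True), repeat=k)]

def satisfies(assignment, clause):
    """A clause is satisfied if at least one of its literals is true."""
    return any(assignment[v] != negated for v, negated in clause)

def ensemble_counts(n_vars=3, k=2, m=2):
    clauses = all_clauses(n_vars, k)
    assignments = list(product((False, True), repeat=n_vars))
    # n_f[X]: number of ordered m-clause formulae admitting X as a solution
    n_f = {x: sum(satisfies(x, c) for c in clauses) ** m for x in assignments}
    # n_s[F]: number of solutions of each formula F
    n_s = {f: sum(all(satisfies(x, c) for c in f) for x in assignments)
           for f in product(clauses, repeat=m)}
    return n_f, n_s

if __name__ == "__main__":
    n_f, n_s = ensemble_counts()
    print(set(n_f.values()))            # a single value: the count does not depend on X
    print(sorted(set(n_s.values())))    # formulae differ in their number of solutions
```

In this toy setting every assignment is a solution of the same number of formulae, while the formulae themselves have different numbers of solutions, so a weight proportional to the number of solutions is not uniform over satisfiable formulae.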

Notice that any satisfiable formula is present in both ensembles, but with different weights, as is
easily seen from a simple computation: for each clause involving k literals, there is only one assignment
of the corresponding k variables that is not SAT. The number of formulae N_f[X] that admit X as a
solution is therefore

\[ \mathcal{N}_f[X] = \left[ \binom{N}{k} \left( 2^k - 1 \right) \right]^M \tag{5.1} \]

which is independent of X. The Planted Ensemble is then by definition

\[ \mathcal{P}_{\rm plant}[\mathcal{F}] = \frac{1}{2^N} \sum_X \frac{\mathbb{I}[\mathcal{F} \text{ is satisfied by } X]}{\mathcal{N}_f[X]} = \frac{\mathcal{N}_s[\mathcal{F}]}{2^N \mathcal{N}_f} \tag{5.2} \]

where N_s[F] is the number of solutions admitted by F. It is then clear that P_plant[F] is not uniform,
but proportional to the number of solutions of F.

As we shall see in the following paragraphs, the two ensembles P_sat and P_plant appear in Feige's
results.

5.1.2 Hardness of approximation results

In this paragraph I shall give a very brief (and non-rigorous) overview of a theorem proved by Feige
in [71].

Feige considers a class of algorithms that take a 3-SAT formula as an input and have two possible
outputs: either SAT or UNSAT. The algorithms in question need not be deterministic: for a given
formula, it is admissible that the output be a random variable, whose distribution will then depend
on the formula. Notice that, since there are two incompatible outputs, algorithms of this kind can
give a wrong answer. However, we shall consider only asymmetric algorithms, i.e. such that if the
input formula is SAT then the output is always SAT; on the other hand, it is admissible that if the
input formula is UNSAT the output be SAT, and we shall only require that the probability of this error
be smaller than 1/2 (or some other finite constant, the actual value of which is unimportant).
He then examines the following 

Hypothesis 1 Even when α is an arbitrarily large constant (independent of N), there is no
polynomial time algorithm that refutes most Random-3-SAT formulae and never wrongly refutes a
satisfiable formula.

This hypothesis states that no algorithm of the class described above can work in polynomial time
on average for 3-SAT formulae drawn from the Uniform Ensemble P_unif. In the statement of this
Hypothesis, the crucial word never refers both to the choice of the formula and to the random moves
of the algorithm. According to the author, no algorithms are known to contradict it. Notice that
numerical experiments demonstrate that as α grows beyond the SAT/UNSAT transition threshold, k-SAT
becomes "easier" (i.e. the average running time for refutation decreases). However, all known
algorithms remain exponential time, and it is only the prefactor of the exponent which decreases.
Therefore this observation does not contradict Hypothesis 1. Also, notice that the fact that α is a
constant independent of N is crucial: polynomial time algorithms are known for α ≫ N^{1/2}.

In his paper Feige also considers a weaker form of this hypothesis, which has several advantages.
The motivation for it is the following. For large α, not only are typical random formulae UNSAT,
but the number of violated constraints becomes concentrated (relative to the Uniform Ensemble of
formulae) around M/8, for every assignment. Therefore, the formulae that are not typical include
all satisfiable formulae, and also all the formulae that admit at least one assignment which violates a
number of clauses εM with 0 < ε < 1/8.
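
The concentration around M/8 is easy to observe numerically. The following sketch draws one random 3-SAT formula and measures the fraction of violated clauses for a few random assignments; the generator (three distinct variables per clause, each negated with probability 1/2), the sizes and the function names are assumptions made for this illustration, and a handful of sampled assignments is of course not a statement about every assignment.

```python
import random

def random_3sat(n_vars, n_clauses, rng):
    """Each clause: three distinct variables, each negated with probability 1/2."""
    return [[(v, rng.random() < 0.5) for v in rng.sample(range(n_vars), 3)]
            for _ in range(n_clauses)]

def violated_fraction(formula, assignment):
    """Fraction of clauses whose three literals are all false under the assignment."""
    violated = sum(all(assignment[v] == negated for v, negated in clause)
                   for clause in formula)
    return violated / len(formula)

if __name__ == "__main__":
    rng = random.Random(0)
    n_vars, alpha = 50, 400
    formula = random_3sat(n_vars, alpha * n_vars, rng)
    for _ in range(5):
        x = [rng.random() < 0.5 for _ in range(n_vars)]
        print(violated_fraction(formula, x))   # values close to 1/8
```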

Hypothesis 2 For every fixed ε ≥ 0, even when α is an arbitrarily large constant (independent of N),
there is no polynomial time algorithm that on most Random-3-SAT formulae outputs TYPICAL
and never outputs TYPICAL on formulae with (1 - ε)M satisfiable clauses.

In this case the algorithm considered has two possible outputs, TYPICAL and NOT TYPICAL, and again
the admissible error is asymmetric. For ε = 0 Hypothesis 2 reduces to Hypothesis 1.

Notice that, despite appearances, Hypothesis 1 implies Hypothesis 2, and therefore Hypothesis 2
is weaker than Hypothesis 1. To see this, let me show that if Hypothesis 2 is violated, then
Hypothesis 1 is also violated. Indeed, if Hypothesis 2 is violated, an algorithm exists which is able
to identify formulae that have a fraction of satisfiable clauses larger than 1 - 1/8. In most cases the
output of this algorithm will be TYPICAL, meaning that the fraction of satisfiable clauses is 1 - 1/8;
however, if the formula has a fraction of satisfiable clauses larger than 1 - 1/8, it will be identified as
such. Therefore, such an algorithm will output TYPICAL most of the time, but it will never output
TYPICAL if the formula is SAT (and therefore has a fraction of satisfiable clauses larger than 1 - 1/8),
thus contradicting Hypothesis 1.

The main result from [74] is the following

Theorem 1 The existence of an algorithm able to approximate in polynomial time the solution to
any of the following problems would contradict Hypothesis 2: min bisection, dense k-subgraph,
max bipartite clique (all within a constant approximation factor) and 2-catalog (within a factor
N^δ, where N is the number of edges and 0 < δ < 1 is some constant).
I shall not define these problems, which are well known in theoretical computer science and of little
interest for the following¹. Suffice it to say that their complexity class is not known. If Hypothesis 2
were proved to be true, as a consequence all these problems would be NP-hard, and this would be an
interesting new result.

As I already mentioned, this theorem establishes a relation between the average-case complexity 
of 3-SAT at large a and the worst-case complexity of some other problems. In this regard, it is a very 
striking result, and it opens the possibility of applying statistical mechanics methods to complexity 
theory. 

Without any ambition to rigor, let me just sketch the proof of the theorem, which is rather
interesting. Let us define a problem P as R-3-SAT-hard if the existence of a polynomial time algorithm
A to solve P would contradict Hypothesis 2. In particular, a problem is R-3-SAT-hard if it is possible
to reduce any instance of 3-SAT to an instance of P to which A can be applied, in such a way as
to contradict Hypothesis 2. Then, Feige proves that several other boolean constraint satisfaction
problems, and their optimization versions, are R-3-SAT-hard.

More specifically, let us consider a boolean function over three variables, f : {TRUE, FALSE}^3 →
{TRUE, FALSE}. The number of such functions is 2^{2^3}, most of which coincide up to renaming or
negation of the variables. For each of them, let us define as t the number of possible inputs, out of 2^3,
for which f evaluates to TRUE, and b (for bias) the number of possible inputs with an odd number of
TRUE values and for which f evaluates to TRUE (or, if it is larger, the same quantity with even instead
of odd). Then, there are 13 distinct such functions for which 2b > t, including AND, OR and XOR.
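
The quantities t and b are simple to compute by enumeration. The sketch below evaluates them for the three functions named above; `t_and_b` is a helper introduced here, and the code does not attempt to enumerate the 13 classes.

```python
from itertools import product

def t_and_b(f):
    """Return (t, b) for a boolean function f of three variables."""
    true_inputs = [x for x in product((False, True), repeat=3) if f(*x)]
    t = len(true_inputs)
    odd = sum(1 for x in true_inputs if sum(x) % 2 == 1)
    b = max(odd, t - odd)  # larger of the odd-parity and even-parity counts
    return t, b

AND3 = lambda x, y, z: x and y and z
OR3 = lambda x, y, z: x or y or z
XOR3 = lambda x, y, z: x ^ y ^ z

if __name__ == "__main__":
    for name, f in (("AND", AND3), ("OR", OR3), ("XOR", XOR3)):
        t, b = t_and_b(f)
        print(name, "t =", t, "b =", b, "2b > t:", 2 * b > t)
```

All three functions satisfy 2b > t: AND has (t, b) = (1, 1), OR has (7, 4) and XOR has (4, 4).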

Consider a "3f-clause" involving 3 literals over N variables and based on any of these 13 functions
f, and a random "3f-formula" made of M = αN such clauses. Feige proves the following

Theorem 2 It is R-3-SAT-hard to distinguish between those random 3f-formulae in which a fraction
just over t/8 of the clauses are satisfied, and those in which this fraction is just below b/4
(assuming α is sufficiently large). In particular, this implies that it is R-3-SAT-hard to approximate
MAX-3f within a constant factor better than t/2b.

This theorem is very interesting in itself: it is here that the link between the complexity of a decision
problem (namely R-3-SAT) and that of an approximation problem is established (even though only
for the average case). The proof of Theorem 2 is straightforward but complicated, and I shall omit it.
Feige then proves the following

Proposition For every ε > 0, there is an α_ε such that for any α > α_ε and N large enough, with
probability 1 the following holds: every set of (1/8 + ε)M clauses in a R-3-SAT formula with
M = αN clauses contains at least N + 1 distinct literals.

The crucial point, which will allow one to establish a link between average-case and worst-case complexity,
is that the proposition holds, with probability 1 over the choice of the 3-SAT formula, for every set
of (1/8 + ε)M clauses of a given formula. The proof of this proposition is rather simple: given N
variables, corresponding to 2N literals, let us select a set S containing N literals. The probability
that a random clause contains no literal from S is (1/2)^3, and the probability that m clauses out of
M contain no literals from S is

\[ P_S(m) = \binom{M}{m} \left( \frac{1}{8} \right)^m \left( \frac{7}{8} \right)^{M-m} \tag{5.3} \]

which, for large M and m = μM, is asymptotically

\[ P_S(m) \simeq \exp \left\{ M \log 2 \left[ -\mu \log_2 \mu - (1-\mu) \log_2 (1-\mu) - 3\mu + (1-\mu) \log_2 (7/8) \right] \right\} \equiv e^{\alpha N \phi(\mu)} . \tag{5.4} \]

This probability is maximum for μ = 1/8, and verifies the large deviations relation

\[ P \left[ \mu = \frac{1}{8} + \epsilon \right] \simeq \exp \left[ \alpha N \phi''(1/8) \frac{\epsilon^2}{2} \right] \tag{5.5} \]

with φ″(1/8) = -64/7.

Therefore, for any given ε > 0, provided α > -3 × 2/[φ″(1/8) ε^2], we shall have

\[ P \left[ \mu \ge \frac{1}{8} + \epsilon \right] < 2^{-3N} , \qquad P \left[ \mu < \frac{1}{8} + \epsilon \right] > 1 - 2^{-3N} . \tag{5.6} \]

More explicitly:

\[ P \left[ \text{at least } (1/8 + \epsilon) M \text{ clauses out of } M \text{ contain no literal from } S \right] < 2^{-3N} . \tag{5.7} \]

We can now use Boole's inequality,

\[ P \Big[ \bigcup_i A_i \Big] \le \sum_i P [A_i] , \tag{5.8} \]

and write, for all the possible subsets S of N literals out of 2N,

\[ P \Big[ \bigcup_S \left\{ \text{at least } (1/8 + \epsilon) M \text{ clauses out of } M \text{ contain no literal from } S \right\} \Big] \le \sum_S 2^{-3N} \tag{5.9} \]

\[ \Rightarrow \quad P \left[ \text{at least } (1/8 + \epsilon) M \text{ clauses out of } M \text{ contain no literal from any set of } N \text{ literals} \right] < 2^{-N} \tag{5.10} \]

since the number of possible sets S is less than 2^{2N}. This statement is equivalent to the one in the
Proposition: every set of (1/8 + ε)M clauses contains at least N + 1 literals, with probability 1 over
the choice of the formula from which the clauses are taken.

¹ A definition is given in [71].
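
The ingredients of this estimate can be checked numerically. The sketch below writes the exponent per clause as φ(μ) = -μ log μ - (1 - μ) log(1 - μ) - 3μ log 2 + (1 - μ) log(7/8) (natural logarithms), verifies that it vanishes at μ = 1/8 with second derivative -64/7, and evaluates the sufficient value of α quoted in the text. The function names and the finite-difference step are choices made for this illustration.

```python
import math

def phi(mu: float) -> float:
    """Exponent per clause of the binomial probability C(M, m) (1/8)^m (7/8)^(M-m), m = mu*M."""
    entropy = -mu * math.log(mu) - (1.0 - mu) * math.log(1.0 - mu)
    return entropy - 3.0 * mu * math.log(2.0) + (1.0 - mu) * math.log(7.0 / 8.0)

def second_derivative(f, x: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def alpha_epsilon(eps: float) -> float:
    """Sufficient clause density -3*2 / (phi''(1/8) * eps^2) = 42 / (64 * eps^2)."""
    return -6.0 / ((-64.0 / 7.0) * eps ** 2)

if __name__ == "__main__":
    print(phi(0.125), second_derivative(phi, 0.125), -64.0 / 7.0)
    eps = 0.01
    a = alpha_epsilon(eps)
    # exponent per variable at this density, compared with the target log(2^-3)
    print(a, a * (-64.0 / 7.0) * eps ** 2 / 2.0, -3.0 * math.log(2.0))
```

At the density returned by `alpha_epsilon` the exponent per variable equals -3, which is below -3 log 2, so the bound 2^{-3N} holds; the union bound over fewer than 2^{2N} sets S then leaves a factor 2^{-N}.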

The proof of Theorem 1 then proceeds as follows, for each of the graph-based problems P listed
in the statement. Some 3AND-formula F with M = αN clauses in N variables is mapped to a graph
G by a suitable construction. The actual constructions vary with the specific problem P and I shall
omit them. Let us take the case of min bisection for concreteness. The Proposition is used to prove
that if F has at most (1/8 + ε)M satisfiable clauses, then the corresponding G has a cut of width
at least (1 - ε)M; while if F has at least (1/4 - ε)M satisfiable clauses, then the corresponding G
has a cut of width 3(1/4 + ε)M. This means that if it is possible to approximate min bisection on
every instance within a factor 3/4, then it is possible to compute the approximate bisection, and from
the approximate value it will be possible to distinguish the two cases (i.e. of typical 3AND-formulae
with (1/8 + ε)M satisfiable clauses vs. non-typical 3AND-formulae with (1/4 - ε)M satisfiable clauses).
Because of Theorem 2, this contradicts Hypothesis 2, and thus proves Theorem 1.



5.1.3 Performance of Warning Propagation on the planted distribution 

The problem of establishing or refuting Hypothesis 1 and/or Hypothesis 2 was tackled by Feige and
collaborators in [75]. That paper takes a step forward in the direction of refutation, but does not
succeed in proving it in general.
The authors consider a simple message passing procedure, called Warning Propagation (WP).
Given a 3-SAT formula F and the factor graph G representing it, two kinds of messages are defined
for each edge in G, i.e. for each pair (C_a, x_i) where C_a is a clause and x_i a variable appearing in
it: clause-to-variable messages u_{a→i} are binary variables equal to 0 or 1; variable-to-clause messages
h_{i→a} are integer variables (positive, negative or null). The following update rule is defined:

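$$h_{i \to a} = \sum_{b \in \partial_+ i \setminus a} u_{b \to i} - \sum_{b \in \partial_- i \setminus a} u_{b \to i} , \qquad u_{a \to i} = \prod_{j \in \partial a \setminus i} \mathbb{I}\left[ h_{j \to a} \text{ pushes } X_j \text{ to violate } C_a \right] \tag{5.11}$$

where $\partial a$ is the set of variables appearing in clause $C_a$, the backslash denotes the removal of an element, $\partial_+ i$ is the set
of clauses in which variable $X_i$ appears non-negated, and $\partial_- i$ is the set of clauses in which it appears
negated. The indicator is 1 when $h_{j \to a} < 0$ for a variable $X_j$ appearing non-negated in $C_a$, or $h_{j \to a} > 0$ for a
variable appearing negated.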

WP is defined as the following algorithm, taking a 3-SAT formula $\mathcal{F}$ as input and returning a partial
assignment $X$ as output:

procedure WarningPropagation($\mathcal{F}$)
    Construct the factor graph $\mathcal{G}$ representing $\mathcal{F}$
    Randomly initialize the clause-to-variable messages $\{u_{a \to i}\}$ to 0 or 1
    repeat
        Randomly order the edges of $\mathcal{G}$
        Update the messages $h_{i \to a}$ and $u_{a \to i}$ in the selected order according to the rule (5.11)
    until no message changes in the update
    Compute a partial assignment $X$ based on the messages:
    if $\sum_{a \in \partial_+ i} u_{a \to i} - \sum_{a \in \partial_- i} u_{a \to i} > 0$ then
        $X_i$ = TRUE
    else if $\sum_{a \in \partial_+ i} u_{a \to i} - \sum_{a \in \partial_- i} u_{a \to i} < 0$ then
        $X_i$ = FALSE
    else
        $X_i$ is unassigned
    end if
    Return $X$
end procedure

Notice that some variables in X will be unassigned at the end of WP. 
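
The procedure can be sketched in a few lines of Python. This is only an illustration of the update rule (5.11), not the implementation analysed in [75]. It assumes a DIMACS-like encoding that does not appear in the text (a clause is a list of non-zero integers, `+i` for $X_i$ and `-i` for its negation), that no variable occurs twice in the same clause, and a hypothetical cap `max_sweeps` on the number of sweeps.

```python
import random

def warning_propagation(clauses, n_vars, max_sweeps=1000, seed=0):
    """Sketch of WP. Returns a partial assignment {i: bool}, or None if the
    messages have not converged after max_sweeps sweeps (an assumed cap)."""
    rng = random.Random(seed)
    # literals[a] lists (variable, sign) for clause a; sign is +1 or -1.
    literals = [[(abs(l), 1 if l > 0 else -1) for l in clause] for clause in clauses]
    edges = [(a, i) for a, lits in enumerate(literals) for i, _ in lits]
    occurrences = {i: [] for i in range(1, n_vars + 1)}
    for a, lits in enumerate(literals):
        for i, s in lits:
            occurrences[i].append((a, s))
    # Random initial clause-to-variable warnings u[a, i] in {0, 1}.
    u = {edge: rng.randint(0, 1) for edge in edges}

    def h(j, a):
        # Field h_{j->a}: signed sum of the warnings sent to j by clauses other than a.
        return sum(s * u[(b, j)] for b, s in occurrences[j] if b != a)

    for _ in range(max_sweeps):
        rng.shuffle(edges)
        changed = False
        for a, i in edges:
            # u_{a->i} = 1 iff every other variable of clause a is pushed
            # towards the value that violates its literal in a.
            new = int(all(s * h(j, a) < 0 for j, s in literals[a] if j != i))
            if new != u[(a, i)]:
                u[(a, i)] = new
                changed = True
        if not changed:
            break
    else:
        return None  # no convergence within the cap

    assignment = {}
    for i in range(1, n_vars + 1):
        field = sum(s * u[(a, i)] for a, s in occurrences[i])
        if field > 0:
            assignment[i] = True
        elif field < 0:
            assignment[i] = False
    return assignment
```

Returning `None` when the cap is reached corresponds to the situation discussed below, where the loop is stopped by hand and no assignment is produced.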
The main result proved in [75] is the following

Theorem 2 For any assignment $Y$ and any formula $\mathcal{F}$ from the ensemble $\mathcal{P}_{\mathrm{plant}}[\mathcal{F}]$ planted on $Y$
with large enough $\alpha$ (but constant in $N$), the following is true with probability $1 - e^{-O(\alpha)}$ over
the choice of the formula and the random moves of WP:

1. WP($\mathcal{F}$) converges after at most $O(\log N)$ iterations

2. The fraction of variables assigned in $X$ is $1 - e^{-O(\alpha)}$, and for each of them $X_i = Y_i$ (the
value it takes in the planted assignment $Y$)

3. The formula obtained by simplifying $\mathcal{F}$ with the values assigned in $X$ can be satisfied in
time $O(N)$

5.1.4 Discussion of the known results and problem definition

Theorem 2 establishes that WP has some properties of the algorithm described in Hypothesis 1 , but 
with some important differences, as I shall discuss in this paragraph. 

First, WP is a constructive algorithm, but it is not complete: it is possible that it never converges 
(i.e. that the loop goes on for ever); however, if it does converge, it provides an assignment which can 
be easily checked. One can set a fixed maximum number of iterations A/j and stop the execution if 
it is reached; the output will then be UNSAT, and this will possibly be wrong. If, on the contrary, an
assignment is returned (and it is checked to be satisfying), the output will be SAT, and this will surely
be true.

Therefore WP is an asymmetric algorithm, which never outputs SAT on an UNSAT formula, but
which sometimes outputs UNSAT on a SAT formula. The algorithm described in Hypothesis 1 is different
in this regard, as it must never return UNSAT on a SAT formula.

Second, the statements in Theorem 2 hold in probability for formulae drawn from the Planted 
Ensemble, while in Hypothesis 1 the Uniform Ensemble is considered. 

The conclusion which can be drawn is that Theorem 2 refutes the following modified

Hypothesis 1_p Planted Even when $\alpha$ is an arbitrarily large constant (independent of $N$), there
is no polynomial time algorithm that refutes most Random-3-SAT formulae from the Planted
Ensemble $\mathcal{P}_{\mathrm{plant}}$, and outputs SAT with probability $p$ on a 3-SAT formula which is satisfiable.

The differences relative to Hypothesis 1 are written in italics in Hypothesis 1_p Planted: the distribution
of formulae is the Planted Ensemble instead of the Uniform one, and satisfiable formulae are recognized
with probability $p$ instead of always.

The question I shall try to answer in the rest of this Chapter is whether it is possible to make further
progress towards the refutation of Hypothesis 1, and in particular whether the convergence of WP can be
established for formulae drawn from the Satisfiable Ensemble $\mathcal{P}_{\mathrm{sat}}$. This is equivalent to proving it for
the Uniform Ensemble, since $\mathcal{P}_{\mathrm{sat}}$ is the restriction of $\mathcal{P}_{\mathrm{unif}}$ to satisfiable formulae, and for formulae
that are not SAT it is admissible for the algorithm to give wrong answers (i.e. not to converge).

The main conclusion that we shall reach is that this is indeed true, and that the following 

Hypothesis Ip Even when a is an arbitrarily large constant (independent on N), there is no poly- 
nomial time algorithm that refutes most Random-3-SAT formulae "Ppiant, and outputs SAT with 
probability p on a 3-SAT formula which is satisfiable. 

is wrong for any p < 1. A similarly probabilistic version of Hypothesis 2 will also be refuted. 

5.2 Free energy of the uniform distribution of satisfiable formulae

In this Section, I shall apply the replica formalism for diluted systems described in Paragraph 1.4.3
to a spin glass problem which is equivalent to Random-3-SAT, in order to derive the properties of the
formulae in $\mathcal{P}_{\mathrm{sat}}$ and of their solutions.

The following computation follows the one presented in [77], the main difference being the introduction
(in Paragraph 5.2.3) of a "chemical potential" that makes it possible to select only the formulae that
are satisfiable.

5.2.1 Replicated partition function of $k$-SAT

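In this Section, we shall use the representation of an assignment $X$ as a collection of $N$ Ising spins,
$\sigma = \{\sigma_1, \dots, \sigma_N\}$. For a given $k$-SAT formula $\mathcal{F}$ and a given configuration $\sigma$ we define the energy
function

$$E_{\mathcal{F}}(\sigma) = \sum_{i=1}^{M} \mathbb{I}[\sigma \text{ violates } C_i] \tag{5.12}$$

where $C_i$ is the $i$-th clause in $\mathcal{F}$. This energy is simply the number of clauses in $\mathcal{F}$ that are violated by $\sigma$.
The partition function is defined as

$$Z_{\mathcal{F}}(\beta) = \sum_{\sigma} e^{-\beta E_{\mathcal{F}}(\sigma)} = \sum_{\sigma} \prod_{i=1}^{M} z_i(\sigma) \tag{5.13}$$

where $z_i(\sigma) = \exp\{-\beta\, \mathbb{I}[\sigma \text{ violates } C_i]\}$. The average of the replicated partition function over the
choice of the formula from the Uniform Ensemble $\mathcal{P}_{\mathrm{unif}}$, which I shall denote by an overline, is

$$\overline{Z(\beta)^n} = \sum_{\sigma^1, \dots, \sigma^n} \overline{ \prod_{i=1}^{M} z_i(\sigma^1) \cdots z_i(\sigma^n) } \tag{5.14}$$

where $\sigma^a$ is the $N$-component configuration of replica $a$.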
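
For very small formulae, the definitions (5.12) and (5.13) can be evaluated by brute force. The sketch below assumes the same DIMACS-like clause encoding as before (a hypothetical choice, not from the text) and the convention that spin $+1$ stands for TRUE; it is exponential in $N$ and only meant to make the definitions concrete.

```python
import itertools
import math

def energy(clauses, sigma):
    """Number of clauses violated by sigma, a dict variable -> spin (+1 or -1)."""
    def violated(clause):
        # A clause is violated when every one of its literals is false.
        return all(sigma[abs(l)] != (1 if l > 0 else -1) for l in clause)
    return sum(violated(c) for c in clauses)

def partition_function(clauses, n_vars, beta):
    """Brute-force Z(beta): sum of exp(-beta * E) over all 2^N configurations."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_vars):
        sigma = dict(zip(range(1, n_vars + 1), spins))
        total += math.exp(-beta * energy(clauses, sigma))
    return total
```

For the single clause `[[1, 2, 3]]` on three variables, seven configurations have zero energy and one has energy 1, so the function returns $7 + e^{-\beta}$; as $\beta \to \infty$ it counts the satisfying assignments.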

Since the literals appearing in each clause are extracted independently of the other clauses, the
average over the choice of the formula reduces to the average over the literals appearing in a clause,
raised to the power $M$:

$$\overline{Z(\beta)^n} = \sum_{\sigma^1, \dots, \sigma^n} \left[\, \overline{z(\sigma^1) \cdots z(\sigma^n)} \,\right]^M . \tag{5.15}$$

Let us consider a term in the sum, corresponding to a given $\vec\sigma = (\sigma^1, \dots, \sigma^n)$. It is the average over
the choice of the literals appearing in the clause of a product over the replica index $a$ of a quantity
which is 1 if the clause considered is satisfied by replica $a$ and $e^{-\beta}$ otherwise. Let us denote by $i_j$ the
index of the $j$-th literal in the clause, and by $q_j$ a variable which is $-1$ if it is negated and 1 otherwise.
We have:

$$\overline{z(\sigma^1) \cdots z(\sigma^n)} = \binom{N}{k}^{-1} \sum_{i_1 < \cdots < i_k} \frac{1}{2^k} \sum_{q_1, \dots, q_k} \prod_{a=1}^{n} \left[ 1 + (e^{-\beta} - 1) \prod_{j=1}^{k} \delta(\sigma^a_{i_j}, q_j) \right] \tag{5.16}$$

where $\delta$ is a Kronecker function. In the following we shall consider the limit $N \to \infty$, and in view
of this we can neglect the constraint of the $k$ indices being different and approximate the binomial
with $N^k$.

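The product over $a$ appearing in (5.16) is a function of the replicated configurations $\vec\sigma_i$ at the
sites $\{i_1, \dots, i_k\}$. Since we are averaging over the choice of the sites, it is convenient to introduce

$$\rho(\vec\tau | \vec\sigma) = \frac{1}{N} \sum_{i=1}^{N} \prod_{a=1}^{n} \delta(\tau^a, \sigma_i^a) \tag{5.17}$$

which is the fraction of sites that, for a given $\vec\sigma$, are equal to the $n$-component configuration $\vec\tau$. We
then have

$$\overline{z(\sigma^1) \cdots z(\sigma^n)} = \sum_{\vec\tau_1, \dots, \vec\tau_k} \rho(\vec\tau_1 | \vec\sigma) \cdots \rho(\vec\tau_k | \vec\sigma)\, \mathcal{E}(\vec\tau_1, \dots, \vec\tau_k) \tag{5.18}$$

where

$$\mathcal{E}(\vec\tau_1, \dots, \vec\tau_k) = \frac{1}{2^k} \sum_{q_1, \dots, q_k \in \{-1, 1\}} \prod_{a=1}^{n} \left[ 1 + (e^{-\beta} - 1) \prod_{j=1}^{k} \delta(\tau_j^a, q_j) \right] . \tag{5.19}$$

The replicated partition function (5.15) is then

$$\overline{Z(\beta)^n} = \sum_{\vec\sigma} \exp\left\{ M \log \sum_{\vec\tau_1, \dots, \vec\tau_k} \rho(\vec\tau_1 | \vec\sigma) \cdots \rho(\vec\tau_k | \vec\sigma)\, \mathcal{E}(\vec\tau_1, \dots, \vec\tau_k) \right\} . \tag{5.20}$$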

We can introduce the function $c(\vec\tau)$ and multiply the previous expression by the functional integral

$$\int_0^1 \delta c(\cdot)\; \delta[c(\vec\tau) - \rho(\vec\tau | \vec\sigma)] = 1 \tag{5.21}$$

(where the integrand is a functional Dirac distribution), to obtain

$$\overline{Z(\beta)^n} = \int_0^1 \delta c(\cdot)\, \exp\left\{ \alpha N \log \sum_{\vec\tau_1, \dots, \vec\tau_k} c(\vec\tau_1) \cdots c(\vec\tau_k)\, \mathcal{E}(\vec\tau_1, \dots, \vec\tau_k) \right\} \sum_{\vec\sigma} \delta[c(\cdot) - \rho(\cdot | \vec\sigma)] . \tag{5.22}$$

The sum over $\vec\sigma$ is the number of $n$-replicated $N$-site configurations such that, for any $\vec\tau$, the fraction
of sites that have replicated configuration $\vec\tau$ is equal to $c(\vec\tau)$. For each of the possible values of $\vec\tau$
one has to choose the $N c(\vec\tau)$ sites that will have $\vec\tau$ as their replicated configuration, and the number
of ways to do it is the multinomial coefficient:

$$\sum_{\vec\sigma} \delta[c(\cdot) - \rho(\cdot | \vec\sigma)] = \frac{N!}{\prod_{\vec\tau} [N c(\vec\tau)]!} \simeq \exp\left\{ -N \sum_{\vec\tau} c(\vec\tau) \log c(\vec\tau) \right\} \tag{5.23}$$

to the leading order as $N \to \infty$.

The "physical" interpretation of the previous results is the following: the function $c(\cdot)$ is the order
parameter of our theory, the term which multiplies $N$ in the exponent of (5.22) is the effective energy
expressed in terms of $c(\cdot)$, and (5.23) is the (microcanonical) entropy of $c(\cdot)$. This follows exactly
the scheme traced in Paragraph 1.4.3. Notice, moreover, that the "physical" inverse temperature $\beta$
only appears in the definition of $\mathcal{E}$, which is the effective interaction strength, and that the parameter
which plays the role of the inverse temperature in the effective theory is $\alpha$.
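
The leading-order estimate in (5.23) is the usual Stirling approximation of a multinomial coefficient, and it can be checked numerically. The sketch below is an illustration under the assumption that the occupation numbers $N c(\vec\tau)$ are given as a list of integers `counts` (a name introduced here); it compares the exact logarithm, computed with `math.lgamma`, with $-N \sum c \log c$.

```python
import math

def log_multinomial(counts):
    """Exact log of N! / prod_tau (N c(tau))!, where counts[tau] = N c(tau)."""
    n_sites = sum(counts)
    return math.lgamma(n_sites + 1) - sum(math.lgamma(c + 1) for c in counts)

def entropy_estimate(counts):
    """Leading-order estimate -N sum_tau c(tau) log c(tau), as in (5.23)."""
    n_sites = sum(counts)
    return -n_sites * sum((c / n_sites) * math.log(c / n_sites)
                          for c in counts if c > 0)
```

The two quantities differ by terms of order $\log N$, so their ratio tends to 1 as $N$ grows at fixed $c(\cdot)$.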

5.2.2 Free energy and replica symmetric ansatz 

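We can write (5.22) in terms of an effective free energy density

$$\mathcal{F}[c(\cdot), n, \beta, \alpha] = -\sum_{\vec\tau} c(\vec\tau) \log c(\vec\tau) + \alpha \log \sum_{\vec\tau_1, \dots, \vec\tau_k} c(\vec\tau_1) \cdots c(\vec\tau_k)\, \mathcal{E}(\vec\tau_1, \dots, \vec\tau_k) \tag{5.24}$$

as the functional integral

$$\overline{Z(\beta)^n} = \int_0^1 \delta c(\cdot)\, e^{N \mathcal{F}[c(\cdot), n, \beta, \alpha]} \tag{5.25}$$

which in the thermodynamic limit $N \to \infty$ can be evaluated by saddle point. The free energy density,
defined as

$$f(\beta, \alpha) \equiv -\frac{1}{\beta} \lim_{N \to \infty} \frac{1}{N} \lim_{n \to 0} \frac{1}{n} \log \overline{Z(\beta)^n} \tag{5.26}$$

is then equal to the limit for $n \to 0$ of the extremum value of (5.24) over $c(\cdot)$,

$$f(\beta, \alpha) = -\lim_{n \to 0} \frac{1}{\beta n} \mathop{\mathrm{extremum}}_{c(\cdot)} \mathcal{F}[c(\cdot), n, \beta, \alpha] . \tag{5.27}$$

Notice however that, as usual with replica calculations, the order of the two limits $N \to \infty$ and $n \to 0$
has been reversed, which has no a priori justification.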

In order to compute the extremum of the effective free energy, some assumption must be made
on the form of the function $c(\cdot)$. The replica symmetric ansatz considers only functions that are
symmetric in the replica index, of the form:

$$c(\vec\tau) = \gamma\left( \sum_{a=1}^{n} \tau^a \right) . \tag{5.28}$$

Under this assumption, a convenient parameterization of the function $\gamma(\cdot)$ is in terms of the auxiliary
function $R(h)$

$$c(\vec\tau) = \int_{-\infty}^{\infty} dh\, R(h)\, \frac{\exp\left( \frac{\beta h}{2} \sum_{a=1}^{n} \tau^a \right)}{\left[ 2 \cosh \frac{\beta h}{2} \right]^n} . \tag{5.29}$$

A few remarks are in order. First, we expect the function $\gamma(\cdot)$ to be even, because in (5.19) we are
summing over the values of $q_j$, and $\overline{Z(\beta)^n}$ is therefore invariant under $\vec\tau \to -\vec\tau$. This implies that
$R(h)$ must also be even. Second, $c(\vec\tau)$ is normalized to 1 (it is the fraction of sites that have replicated
configuration $\vec\tau$), so $R(h)$ must also be normalized:

$$\int_{-\infty}^{\infty} dh\, R(h) = 1 . \tag{5.30}$$

Third, the equation (5.29) defining $R(h)$ is, apart from the factor in the denominator, a Laplace
transform, so that $R(h)$ is indeed well defined. Finally, notice that the expression multiplying $R(h)$ in
(5.29) is the Gibbs weight of a system of $n$ Ising spins $\tau^a$ in a uniform magnetic field $h$ at inverse temperature
$\beta$. Since the physical interpretation^1 of $c(\vec\tau)$ is the fraction of sites having the replicated configuration
$\vec\tau$, that is to say, the probability that the configuration $\vec\tau$ is observed, the interpretation of $h$ is indeed
that of a magnetic field acting on the spins, and the interpretation of $R(h)$ is that of the probability
distribution of the values of the field $h$. This observation motivates the introduction of $R(h)$.

5.2.3 Selection of satisfiable formulae by means of a "chemical potential" 

So far, the computation has been performed for any $\beta$, and nothing in it ensures that only satisfiable
formulae are considered in the average: the ensemble we are considering is $\mathcal{P}_{\mathrm{unif}}[\mathcal{F}]$ instead of $\mathcal{P}_{\mathrm{sat}}[\mathcal{F}]$.
I shall now introduce a method which allows us to restrict the ensemble to $\mathcal{P}_{\mathrm{sat}}[\mathcal{F}]$.

General strategy 

In all generality, for systems with discrete configurations $\psi$ and discrete non-negative energy $E(\psi) \in
\{E_0, E_1, \dots\}$, the partition function $Z(\beta)$ can be written as

$$Z(\beta) = g_0 e^{-\beta E_0} + g_1 e^{-\beta E_1} + \cdots \tag{5.31}$$

where $g_i$ is the number of configurations with energy $E_i$. In the presence of disorder, the values of
$\{E_i\}$ depend on the sample. The average over disorder of the replicated partition function is then

$$\overline{Z(\beta)^n} = \overline{ \left[ g_0 e^{-\beta E_0} + g_1 e^{-\beta E_1} + \cdots \right]^n } . \tag{5.32}$$

^1 This is true for the function $c(\cdot)$ which extremizes the free energy density. The physical interpretation of $h$ and
$R(h)$ therefore holds only for the $R(h)$ corresponding to the extremum.

In order to compute the free energy of the model, the limit $n \to 0$ must be taken, and if one is
interested in the low temperature behavior of the model, the question of the value of the product
$\beta n = \nu$ arises.

Normally, what is needed is the low temperature behavior of the average of the free energy over
all values of the disorder, and $n$ must go to 0 before sending $\beta \to \infty$, which corresponds to $\nu = 0$.
In our case, however, we would like to select the values of the disorder parameters that minimize the
energy of the system, and to restrict the average to these values. Let us see what happens when taking
$\beta \to \infty$ and $n \to 0$ with finite $\nu$. We can formally expand the multinomial in (5.32) to obtain

$$\overline{Z(\beta)^n} = \overline{ g_0^n e^{-\nu E_0} + n\, g_0^{n-1} e^{-\beta (n-1) E_0} \left[ g_1 e^{-\beta E_1} + \cdots \right] + \cdots }$$

$$= \overline{ g_0^n e^{-\nu E_0} + n\, g_0^{n-1} g_1\, e^{-\nu E_0}\, e^{\beta (E_0 - E_1)} + \cdots } \tag{5.33}$$

where each term after the first has a factor of the form $e^{\beta (E_0 - E_i)}$ with $E_i > E_0$, which makes it vanish, so that only the first
term contributes. Since $g_0$ is independent of $n$, and $n \to 0$, we are left with

$$\overline{Z(\beta)^n} \simeq \overline{ e^{-\nu E_0} } . \tag{5.34}$$

We see that the consequence of taking $\nu > 0$ is to include in the computation of the replicated partition
function, for a given realization of the disorder, only the lowest energy configurations.
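
The limit leading to (5.34) can be illustrated on a single sample with a handful of energy levels. The sketch below is a toy check, not part of the original derivation: the list `levels` of pairs $(g_i, E_i)$ is a hypothetical input, and the logarithm of $Z^n$ is evaluated at $\beta = \nu / n$ after factoring out the ground state, so that no exponential overflows.

```python
import math

def log_replicated_z(levels, nu, n):
    """log(Z^n) at beta = nu / n for one sample with levels [(g_i, E_i), ...]."""
    beta = nu / n
    e0 = min(energy for _, energy in levels)
    # Factor out the ground state so that every exponent is non-positive.
    s = sum(g * math.exp(-beta * (energy - e0)) for g, energy in levels)
    return -nu * e0 + n * math.log(s)
```

With `levels = [(10, 2.0), (1000, 3.0)]` and $\nu = 1.5$, the result approaches $-\nu E_0 = -3$ as $n \to 0$, even though the excited level is a hundred times more degenerate than the ground state.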

The energy $E_0$ is the extensive energy of the ground state of the system for a given realization of
the disorder, which can be regarded as a random variable over the distribution of disorder. Let us
denote by $\omega(e)$ the large deviations function of the distribution of the energy density $e = E_0/N$ of the
ground state, i.e.

$$\mathbb{P}[E_0 = N e] = e^{N \omega(e) + o(N)} . \tag{5.35}$$

It is reasonable to expect that $\omega(e)$ will be a negative concave function (i.e. $\omega''(e) < 0$), vanishing at
its maximum. Let us assume that this is the case. For large $N$ we shall have

$$\overline{Z(\beta)^n} \simeq \int de\; e^{N [\omega(e) - \nu e]} \simeq e^{N \varphi(\nu)} \tag{5.36}$$

where

$$\varphi(\nu) = \max_{e}\, [\omega(e) - \nu e] \tag{5.37}$$

is the Legendre transform of $\omega(e)$, provided the concavity assumption on $\omega$ holds (which can be verified
a posteriori).

The integral in (5.36) will be dominated by the contribution from the value of $e$ which maximizes
the exponent, which is given by

$$e_0(\nu) = -\frac{\partial \varphi}{\partial \nu}(\nu) . \tag{5.38}$$

The partition function computed in (5.36) is therefore averaged only on those values of disorder that
give a ground state energy equal to $N e_0(\nu)$: the parameter $\nu$ allows one to restrict the distribution of the
disorder to some subset with a well defined ground state energy. In this regard, it plays a role similar
to a chemical potential in thermodynamics.
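
The way $\nu$ selects an energy through (5.37) and (5.38) can be seen on a toy example. The sketch below uses a quadratic large deviations function, which is an assumption made purely for illustration (it is not the function of $k$-SAT); the maximization is done on a grid and the derivative by a centred finite difference.

```python
def legendre(omega, nu, e_grid):
    """phi(nu) = max_e [omega(e) - nu * e], maximised over a grid of energies."""
    return max(omega(e) - nu * e for e in e_grid)

def selected_energy(omega, nu, e_grid, step=1e-4):
    """e_0(nu) = -d phi / d nu, estimated by a centred finite difference."""
    return -(legendre(omega, nu + step, e_grid)
             - legendre(omega, nu - step, e_grid)) / (2 * step)

def toy_omega(e):
    """Toy choice: omega(e) = -(e - 0.3)^2 / (2 * 0.05), maximal (zero) at e = 0.3."""
    return -(e - 0.3) ** 2 / (2 * 0.05)
```

For this toy function the maximizer is $e_0(\nu) = 0.3 - 0.05\,\nu$: at $\nu = 0$ one recovers the typical energy, and increasing $\nu$ moves the average towards samples with lower ground state energy.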

Let us now turn to the application of this program to compute the replica symmetric free energy of
$k$-SAT over the Satisfiable Ensemble $\mathcal{P}_{\mathrm{sat}}[\mathcal{F}]$. The strategy will be to substitute the replica symmetric
ansatz (5.29) for $c(\cdot)$ in the free energy (5.24), and take the limits $\beta \to \infty$ and $n \to 0$ with finite
$\nu = \beta n$, to obtain a free energy functional depending on $\nu$, analogous to $\varphi(\nu)$ in (5.36), and which
will have a functional dependence on $R(h)$. We shall then derive the saddle point equations corresponding
to (5.27), which determine $R(h)$, and solve them for generic $\nu$. We shall compute the average
ground state energy as a function of $\nu$, as in (5.38), and find the value of $\nu$ corresponding to zero
energy, which will select satisfiable formulae. The equilibrium distribution of fields $R(h)$ over the
Satisfiable Ensemble will finally allow us to characterize the solutions.



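Entropic term

The entropic term of (5.24),

$$\mathcal{S}[c(\cdot)] \equiv -\sum_{\vec\tau} c(\vec\tau) \log c(\vec\tau) , \tag{5.39}$$

can be computed by means of the following identity:

$$x \log x = \left. \frac{d}{dp}\, x^{p+1} \right|_{p=0} . \tag{5.40}$$

We obtain:

$$\mathcal{S}[c(\cdot)] = -\left. \frac{d}{dp} \sum_{\vec\tau} c(\vec\tau)^{p+1} \right|_{p=0} \tag{5.41}$$

with

$$\sum_{\vec\tau} c(\vec\tau)^{p+1} = \sum_{\vec\tau} \left[ \int_{-\infty}^{\infty} dh\, R(h)\, \frac{\exp\left( \frac{\beta h}{2} \sum_{a=1}^{n} \tau^a \right)}{\left[ 2 \cosh \frac{\beta h}{2} \right]^n} \right]^{p+1} \tag{5.42}$$

$$= \int_{-\infty}^{\infty} dh_1 \cdots dh_{p+1}\, R(h_1) \cdots R(h_{p+1})\, \frac{\left[ 2 \cosh\left( \frac{\beta}{2} \sum_{s=1}^{p+1} h_s \right) \right]^n}{\left[ 2 \cosh \frac{\beta h_1}{2} \cdots 2 \cosh \frac{\beta h_{p+1}}{2} \right]^n} . \tag{5.43}$$

We can now multiply by

$$\int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \left( \hat{x} - \sum_{s=1}^{p+1} h_s \right)} = 1 \tag{5.44}$$

to obtain

$$\sum_{\vec\tau} c(\vec\tau)^{p+1} = \int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x}} \left[ 2 \cosh \frac{\beta \hat{x}}{2} \right]^n \left[ \int_{-\infty}^{\infty} dh\, \frac{R(h)\, e^{-i x h}}{\left[ 2 \cosh \frac{\beta h}{2} \right]^n} \right]^{p+1} . \tag{5.45}$$

By taking the derivative as in (5.41) we find

$$\mathcal{S}[c(\cdot)] = -\int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x}} \left[ 2 \cosh \frac{\beta \hat{x}}{2} \right]^n \phi(x) \log \phi(x) \tag{5.46}$$

where

$$\phi(x) = \int_{-\infty}^{\infty} dh\, \frac{R(h)\, e^{-i x h}}{\left[ 2 \cosh \frac{\beta h}{2} \right]^n} . \tag{5.47}$$

In the limit $\beta \to \infty$, $n \to 0$ with finite $\nu = \beta n$ we have

$$\lim_{n \to 0} \left[ 2 \cosh \frac{\nu h}{2 n} \right]^n = e^{\frac{\nu}{2} |h|} \tag{5.48}$$

and

$$\mathcal{S}[R(\cdot)] = -\int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{\nu}{2} |\hat{x}|}\, \phi(x) \log \phi(x) \tag{5.49}$$

with

$$\phi(x) = \int_{-\infty}^{\infty} dh\, e^{-i x h - \frac{\nu}{2} |h|}\, R(h) . \tag{5.50}$$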
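
The limit (5.48), which turns every hyperbolic cosine into an exponential of $|h|$, is easy to check numerically. The sketch below is only an illustration; it evaluates the logarithm of $[2\cosh(\beta h/2)]^n$ at $\beta = \nu/n$, rewritten so that the hyperbolic cosine never overflows.

```python
import math

def log_two_cosh_power(h, nu, n):
    """log of [2 cosh(beta h / 2)]^n at beta = nu / n, written to avoid overflow."""
    y = abs(nu * h / (2 * n))
    # For y >= 0, log(2 cosh y) = y + log(1 + exp(-2 y)).
    return n * (y + math.log1p(math.exp(-2 * y)))
```

As $n \to 0$ the second term is at most $n \log 2$, so the result tends to $\nu |h| / 2$ for every $h$, including $h = 0$.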



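Energetic term

For the second term in (5.24), we have

$$\mathscr{E}[c(\cdot)] \equiv \alpha \log \sum_{\vec\tau_1, \dots, \vec\tau_k} c(\vec\tau_1) \cdots c(\vec\tau_k)\, \mathcal{E}(\vec\tau_1, \dots, \vec\tau_k) \tag{5.51}$$

$$= \alpha \log \sum_{\vec\tau_1, \dots, \vec\tau_k} c(\vec\tau_1) \cdots c(\vec\tau_k)\, \exp\left\{ -\beta \sum_{a=1}^{n} \prod_{j=1}^{k} \delta(\tau_j^a, 1) \right\} \tag{5.52}$$

where I have simplified the expression (5.19) of the effective coupling $\mathcal{E}$ by exploiting the sum
over $\vec\tau_j$. Substitution of the replica symmetric ansatz (5.29) gives

$$\mathscr{E}[R(\cdot)] = \alpha \log \int_{-\infty}^{\infty} dh_1 \cdots dh_k\, R(h_1) \cdots R(h_k) \sum_{\vec\tau_1, \dots, \vec\tau_k} \frac{\exp\left\{ \beta \sum_{a=1}^{n} \left[ \frac{1}{2} \sum_{j=1}^{k} h_j \tau_j^a - \prod_{j=1}^{k} \delta(\tau_j^a, 1) \right] \right\}}{\prod_{j=1}^{k} \left[ 2 \cosh \frac{\beta h_j}{2} \right]^n} \tag{5.53}$$

$$= \alpha \log \int_{-\infty}^{\infty} dh_1 \cdots dh_k\, \frac{R(h_1)}{\left[ 2 \cosh \frac{\beta h_1}{2} \right]^n} \cdots \frac{R(h_k)}{\left[ 2 \cosh \frac{\beta h_k}{2} \right]^n} \left\{ \sum_{\tau_1, \dots, \tau_k = \pm 1} \exp\left( \beta \left[ \frac{1}{2} \sum_{j=1}^{k} h_j \tau_j - \prod_{j=1}^{k} \delta(\tau_j, 1) \right] \right) \right\}^n . \tag{5.54}$$

As $\beta \to \infty$ the sum over $\tau$ is dominated by the term which maximizes the square parenthesis in
(5.54), while the hyperbolic cosines are given by (5.48), so that:

$$\mathscr{E}[R(\cdot)] = \alpha \log \int_{-\infty}^{\infty} dh_1 \cdots dh_k\, R(h_1) \cdots R(h_k)\, e^{\nu \Phi(\mathbf{h})} \tag{5.55}$$

with $\mathbf{h} = (h_1, \dots, h_k)$ and

$$\Phi(\mathbf{h}) = \max_{\tau_1, \dots, \tau_k} \left\{ \frac{1}{2} \sum_{j=1}^{k} \left( \tau_j h_j - |h_j| \right) - \mathbb{I}[\tau_j = 1\ \forall j] \right\} \tag{5.56}$$

$$= \begin{cases} -\min\{1, h_1, \dots, h_k\} & \text{if } h_j > 0\ \forall j \\ 0 & \text{otherwise.} \end{cases} \tag{5.57}$$

The free energy functional we obtain, putting $\mathcal{S}$ and $\mathscr{E}$ together, is:

$$\mathcal{F}[R(\cdot), \nu, \alpha] = -\int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{\nu}{2} |\hat{x}|}\, \phi(x) \log \phi(x) + \alpha \log \int_{-\infty}^{\infty} dh_1 \cdots dh_k\, R(h_1) \cdots R(h_k)\, e^{\nu \Phi(\mathbf{h})} \tag{5.58}$$

with $\phi(x)$ and $\Phi(\mathbf{h})$ defined in (5.50) and (5.57).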
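
The step from the maximization (5.56) to the closed form (5.57) can be verified by enumerating the $2^k$ spin configurations. The sketch below is an illustrative check, with function names introduced here for the purpose.

```python
import itertools

def phi_bruteforce(h):
    """Maximum over tau in {-1, +1}^k of the bracket in (5.56)."""
    best = -float("inf")
    for tau in itertools.product((-1, 1), repeat=len(h)):
        penalty = 1 if all(t == 1 for t in tau) else 0
        value = 0.5 * sum(t * x - abs(x) for t, x in zip(tau, h)) - penalty
        best = max(best, value)
    return best

def phi_closed_form(h):
    """Closed form (5.57)."""
    return -min(1.0, *h) if all(x > 0 for x in h) else 0.0
```

When all fields are positive, the best configuration either aligns every spin with its field and pays the unit penalty, or flips the spin with the weakest field and pays that field instead; this is the origin of the minimum in (5.57).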
5.2.4 Saddle point equations

We are now in position to derive the saddle point equations that will determine $R(h)$ from the
extremality condition (5.27) for $\mathcal{F}[R(\cdot), \nu, \alpha]$, subject to the normalization condition (5.30), which
we write as

$$\frac{\delta}{\delta R(h)} \left\{ \mathcal{F}[R(\cdot), \nu, \alpha] + \lambda \left[ \int R(h)\, dh - 1 \right] \right\} \tag{5.59}$$

$$= -\int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{\nu}{2} |\hat{x}|}\, [1 + \log \phi(x)]\, e^{-i x h - \frac{\nu}{2} |h|} + \frac{\alpha k}{D} \int_{-\infty}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{\nu \Phi(h, h_2, \dots, h_k)} + \lambda = 0 \tag{5.60}$$

where

$$D \equiv \int_{-\infty}^{\infty} dh_1 \cdots dh_k\, R(h_1) \cdots R(h_k)\, e^{\nu \Phi(\mathbf{h})} \tag{5.61}$$

and where it should be noted that the integral over the fields $h_j$ in (5.60) starts with $h_2$. Both terms
are functions of $h$, and (5.60) must hold identically in $h$.

In principle, the symmetry condition $R(h) = R(-h)$ should also be imposed, by means of a second
Lagrange multiplier. However, it suffices to restrict the range over which (5.60) defines $R(h)$ to positive
values of $h$ and define $R(-h) = R(h)$ with $h > 0$.

Because of the definition (5.57) of $\Phi(\mathbf{h})$, it is convenient to write the integral over the fields in
(5.60) over positive values only. This can be done by noticing that if one of the $h_j$ is negative then $\Phi(\mathbf{h}) = 0$, so
that

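$$\int_{-\infty}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{\nu \Phi(h, h_2, \dots, h_k)} = \int_{-\infty}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k) + \int_{0}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k) \left( e^{\nu \Phi(h, h_2, \dots, h_k)} - 1 \right) \tag{5.62}$$

$$= 1 - \frac{1}{2^{k-1}} + \int_{0}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{-\nu \min\{1, h, h_2, \dots, h_k\}} \tag{5.63}$$

because of the normalization and the symmetry of $R(h)$. We now multiply by the identity

$$\int_{-\infty}^{\infty} \frac{dy\, d\hat{y}}{2\pi}\, e^{-i \hat{y} \left[ y - \min\{1, h_2, \dots, h_k\} \right]} = 1 \tag{5.64}$$

to obtain

$$\int_{-\infty}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{\nu \Phi(h, h_2, \dots, h_k)} = 1 - \frac{1}{2^{k-1}} + \int_{-\infty}^{\infty} \frac{dy\, d\hat{y}}{2\pi}\, e^{-\nu \min\{h, y\} - i y \hat{y}} \times \int_{0}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{i \hat{y} \min\{1, h_2, \dots, h_k\}} . \tag{5.65}$$

Notice that

$$\min\{h, y\} = \frac{1}{2} \left( h + y - |h - y| \right) \tag{5.66}$$

so the exponent in the first integral of the previous equation can be written as

$$-i y \hat{y} - \frac{1}{2} \nu (h + y) + \frac{1}{2} \nu |h - y| \tag{5.67}$$

and changing the integration variables to $x = \hat{y} - i \nu / 2$ and $\hat{x} = h - y$ we obtain

$$\int_{-\infty}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{\nu \Phi(h, h_2, \dots, h_k)} = 1 - \frac{1}{2^{k-1}} + \int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{1}{2} \nu |\hat{x}| - i x h - \frac{1}{2} \nu h} \times \int_{0}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{(i x - \frac{\nu}{2}) \min\{1, h_2, \dots, h_k\}} . \tag{5.68}$$

The exponent in the integral over $dx\, d\hat{x}$ is the same as in the first term of (5.60), and

$$\int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{1}{2} \nu |\hat{x}| - i x h - \frac{1}{2} \nu h} = \int d\hat{x}\; \delta(\hat{x} - h)\, e^{\frac{1}{2} \nu |\hat{x}| - \frac{1}{2} \nu h} = 1 \tag{5.69}$$

since $h > 0$, so we can collect all the terms in (5.60) under the same integral. Let us define the
following functions

$$K(h, x) \equiv \int_{-\infty}^{\infty} \frac{d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{1}{2} \nu |\hat{x}| - i x h - \frac{1}{2} \nu h} \tag{5.70}$$

$$Q(x) \equiv \int_{0}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{(i x - \frac{\nu}{2}) \min\{1, h_2, \dots, h_k\}} \tag{5.71}$$

in terms of which the saddle point equation (5.60) becomes

$$\int_{-\infty}^{\infty} dx\, K(h, x) \left\{ -[1 + \log \phi(x)] + \frac{\alpha k}{D} \left[ 1 - \frac{1}{2^{k-1}} + Q(x) \right] + \lambda \right\} = 0 . \tag{5.72}$$

A solution to this equation is obtained if the curly bracket vanishes identically. In that case,
inverting (5.50) we obtain

$$R(h) = \int_{-\infty}^{\infty} \frac{dx}{2\pi}\, e^{i x h + \frac{\nu}{2} |h|}\, \phi(x) \tag{5.73}$$

$$= \int_{-\infty}^{\infty} \frac{dx}{2\pi}\, \exp\left\{ i x h + \frac{1}{2} \nu |h| - 1 + \frac{\alpha k}{D} \left[ 1 - \frac{1}{2^{k-1}} + Q(x) \right] + \lambda \right\} . \tag{5.74}$$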

5.2.5 Distribution of fields 



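We are now in position to determine the distribution of fields $R(h)$ that satisfies the saddle point
equation (5.74). Since this is a functional equation, its resolution is greatly simplified by making
some assumption on the form of the function. I shall consider the following ansatz for $R(h)$,

$$R(h) = \sum_{p=-\infty}^{\infty} r_p\, \delta(h - p) \tag{5.75}$$

where only integer values of $h$ are considered. I shall later prove that a more general form in which
fractional values are considered reduces to this, suggesting that this is the only solution.

With this assumption (5.74) becomes an equation for the coefficients $\{r_p\}$. Let us begin from
(5.61) by computing

$$D = \int_{-\infty}^{\infty} dh_1 \cdots dh_k\, R(h_1) \cdots R(h_k)\, e^{\nu \Phi(\mathbf{h})} \tag{5.76}$$

$$= \sum_{p_1, \dots, p_k = -\infty}^{\infty} r_{p_1} \cdots r_{p_k} \times 1 + \sum_{p_1, \dots, p_k = 1}^{\infty} r_{p_1} \cdots r_{p_k} \left( e^{-\nu} - 1 \right) \tag{5.77}$$

$$= 1 + \left( \frac{1 - r_0}{2} \right)^k \left( e^{-\nu} - 1 \right) \tag{5.78}$$

where I used the fact that, for integer $\{h_j\}$, if some $h_j$ is negative or null then $\Phi(\mathbf{h})$ is 0 and otherwise
it is $-1$, and the symmetry and normalization of $R(h)$, which imply that $r_p = r_{-p}$ and $\sum_{p=-\infty}^{\infty} r_p = 1$.
Similarly for the term in $Q$ in the exponent of (5.74), which can be written

$$Q(x) = \int_{0}^{\infty} dh_2 \cdots dh_k\, R(h_2) \cdots R(h_k)\, e^{(i x - \frac{\nu}{2}) \min\{1, h_2, \dots, h_k\}} \tag{5.79}$$

$$= \sum_{p_2, \dots, p_k = 0}^{\infty} r_{p_2} \cdots r_{p_k} + \sum_{p_2, \dots, p_k = 1}^{\infty} r_{p_2} \cdots r_{p_k} \left[ \cos(x)\, e^{-\nu/2} - 1 \right] \tag{5.80}$$

$$= \left( \frac{1 + r_0}{2} \right)^{k-1} + \left( \frac{1 - r_0}{2} \right)^{k-1} \left[ \cos(x)\, e^{-\nu/2} - 1 \right] \tag{5.81}$$

$$\equiv A + B \cos(x) \tag{5.82}$$

with

$$A \equiv \left( \frac{1 + r_0}{2} \right)^{k-1} - \left( \frac{1 - r_0}{2} \right)^{k-1} \tag{5.83}$$

$$B \equiv \left( \frac{1 - r_0}{2} \right)^{k-1} e^{-\nu/2} . \tag{5.84}$$

Substituting (5.78) and (5.82) into the saddle point equation (5.74) gives

$$R(h) = e^{\lambda' + \frac{1}{2} \nu |h|} \int_{-\infty}^{\infty} \frac{dx}{2\pi}\, \cos(x h)\, \exp\left\{ \alpha k \left[ \frac{A}{D} + \frac{B}{D} \cos(x) \right] \right\} \tag{5.85}$$

where $h$ can be positive or negative, and

$$\lambda' \equiv \lambda - 1 + \frac{\alpha k}{D} \left( 1 - \frac{1}{2^{k-1}} \right) . \tag{5.86}$$

This form is compatible with the ansatz (5.75), since it vanishes unless $h$ is an integer, and we can
invert it to obtain

$$r_p = e^{\lambda' + \frac{1}{2} \nu |p| + \alpha k \frac{A}{D}} \int_{-\pi}^{\pi} \frac{dx}{2\pi}\, e^{i x p}\, \exp\left[ \alpha k \frac{B}{D} \cos(x) \right] \tag{5.87}$$

$$= e^{\lambda' + \frac{1}{2} \nu |p| + \alpha k \frac{A}{D}}\; I_p\!\left( \alpha k \frac{B}{D} \right) \tag{5.88}$$

where $I_p(x)$ is the modified Bessel function of integer order $p$. The value of $\lambda'$ is determined by the
normalization of $R(h)$, and we obtain

$$r_p = \frac{e^{\frac{\nu}{2} |p|}\, I_p\!\left( \alpha k \frac{B}{D} \right)}{\sum_{q=-\infty}^{\infty} e^{\frac{\nu}{2} |q|}\, I_q\!\left( \alpha k \frac{B}{D} \right)} . \tag{5.89}$$

In this formula, $B$ depends on $r_0$. It is therefore an equation for $r_0$ and, once solved for $r_0$, an
identity for all other values of $p$.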



5.2.6 Ground state energy 

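Having obtained the explicit expression of the equilibrium distribution $R(h)$, we can compute the
average value of the ground state energy density $e_0(\nu)$ for general $\nu$.
Following (5.38), we write from the form of the free energy density functional (5.58)

$$e_0(\nu) = -\frac{d}{d\nu} \mathcal{F}[R(\cdot), \nu, \alpha] \tag{5.90}$$

$$= \int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{1}{2} \nu |\hat{x}|} \left\{ \frac{1}{2} |\hat{x}|\, \phi(x) \log \phi(x) - [1 + \log \phi(x)] \int_{-\infty}^{\infty} dh\, e^{-i x h - \frac{1}{2} \nu |h|}\, \frac{1}{2} |h|\, R(h) \right\} + \frac{\alpha}{D} \int_{0}^{\infty} dh_1 \cdots dh_k\, R(h_1) \cdots R(h_k)\, \min\{1, h_1, \dots, h_k\}\, e^{-\nu \min\{1, h_1, \dots, h_k\}} . \tag{5.91}$$

The integrals over $dx\, d\hat{x}$ can be eliminated by means of the saddle point conditions (5.74) and (5.60),
which give

$$\log \phi(x) = \lambda' + \alpha k \int_{0}^{\infty} dh_2 \cdots dh_k\, \frac{R(h_2) \cdots R(h_k)}{D}\, e^{(i x - \frac{\nu}{2}) \min\{1, h_2, \dots, h_k\}} \tag{5.92}$$

$$\int_{-\infty}^{\infty} \frac{dx\, d\hat{x}}{2\pi}\, e^{i x \hat{x} + \frac{1}{2} \nu |\hat{x}|}\, [1 + \log \phi(x)]\, e^{-i x h - \frac{1}{2} \nu |h|} = \lambda + \alpha k \int_{-\infty}^{\infty} dh_2 \cdots dh_k\, \frac{R(h_2) \cdots R(h_k)}{D}\, e^{\nu \Phi(h, h_2, \dots, h_k)} \tag{5.93}$$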

from which we obtain 



eoii^) = - / hR{h)dh + 



+a / dhi ■ ■ ■ dhk 



R{h2) ■ ■ ■ R{hk) .... , . 

dli2 ■ ■ ■ dhk M mm{l, h2, ■ ■ ■ , hu] + 



R{hi) ■ ■ ■ R{hk) 



■ min{l, /i2, . . . , hk} + (1 - fc) min{l, /ii, . . . , hk} 



xe 



— u inin{l,/ii ,hf^} 



(5.94) 



This expression is valid independently of the form of $R(h)$. For the ansatz (5.75) we have

OO l,oc 



J l.oc , l.oo 

ak -^-^ r^^ ■ • • r„, ak ^sr-^ 



' P2 ' Pk 



p=l 



P2,---,Pk 



D 



P2,---,Pk 



D 



l.oo 



l,oo 



Pl,---,Pk 



'"'pi ' ' ' ^Pk_^-iy 



Pl,---,Pk 



D 



OO 7/1 

Eak / 1 — ro 



p=l 

OO 



fc-1 



" D 



l-ro 
2 

-I//2 



(5.95) 
e-" (5.96) 
(5.97) 



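where the term corresponding to $p_1 = 0$ in the first term of the second line of (5.94) has an extra
factor 1/2 coming from the integral $\int_0^\infty \delta(x)\, dx$. The sum in the last expression can be computed as

$$\sum_{p=1}^{\infty} p\, r_p = \frac{\partial}{\partial \nu} \log \mathcal{J}\!\left( \alpha k \frac{B}{D}, \nu \right) \tag{5.98}$$

where the derivative acts on the second argument only, and where

$$\mathcal{J}(x, \nu) \equiv \sum_{p=-\infty}^{\infty} e^{\frac{\nu}{2} |p|}\, I_p(x) = 2 e^{x \cosh(\nu/2)} - I_0(x) - 2 \sum_{p=1}^{\infty} e^{-\frac{\nu}{2} p}\, I_p(x) \tag{5.99}$$

converges very fast for $\nu > 0$.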
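
The resummation in (5.99) follows from the generating function of the modified Bessel functions, $\sum_p t^p I_p(x) = e^{(x/2)(t + 1/t)}$ with $t = e^{\nu/2}$, and it can be checked numerically. The sketch below is only an illustration: the Bessel functions are evaluated from their power series to keep the example free of external libraries, and the truncation orders `terms` and `p_max` are arbitrary choices that are adequate for moderate $x$ and $\nu$.

```python
import math

def bessel_i(p, x, terms=40):
    """Modified Bessel function I_p(x) for integer p, from its power series."""
    p = abs(p)  # I_{-p} = I_p for integer order
    return sum((x / 2) ** (2 * m + p) / (math.factorial(m) * math.factorial(m + p))
               for m in range(terms))

def j_direct(x, nu, p_max=40):
    """Truncated two-sided sum of exp(nu |p| / 2) I_p(x)."""
    return sum(math.exp(nu * abs(p) / 2) * bessel_i(p, x)
               for p in range(-p_max, p_max + 1))

def j_resummed(x, nu, p_max=40):
    """Right-hand side of (5.99): only a rapidly decaying tail is left to sum."""
    tail = sum(math.exp(-nu * p / 2) * bessel_i(p, x) for p in range(1, p_max + 1))
    return 2 * math.exp(x * math.cosh(nu / 2)) - bessel_i(0, x) - 2 * tail
```

In the resummed form each term of the remaining sum is damped by $e^{-\nu p / 2}$ instead of being amplified by $e^{\nu p / 2}$, which is why it is the convenient expression at large $\nu$.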
Large $\nu$ expansion

I am going to show that the condition $e_0(\nu) = 0$, which corresponds to the selection of satisfiable
formulae from the ensemble $\mathcal{P}_{\mathrm{sat}}[\mathcal{F}]$, is obtained for $\nu \to \infty$.
Let me denote $\epsilon = e^{-\nu}$ and, to first order in $\epsilon$,



G = 



2 D 



' l-rg 
V 2 



l-(i^)^l-e) 



1-e 



2'S 1 - ro 



\k 2 



' l-lo 



, fc-1 



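The Bessel functions $I_p(x)$ can be expanded for small $x$ and $p \geq 0$ as

$$I_p(x) = \frac{1}{p!} \left( \frac{x}{2} \right)^p \left[ 1 + \frac{x^2}{4 (p+1)} + O(x^4) \right] \tag{5.102}$$

and $I_{-p}(x) = I_p(x)$. Since from the definition (5.84) of $B$ we have that it is $O(\epsilon^{1/2})$ while from (5.78)
we have $D = O(1)$, we can expand, using $\alpha k B / D = 2 G \epsilon^{1/2}$,

$$\sum_{p=-\infty}^{\infty} e^{\frac{\nu}{2} |p|}\, I_p\!\left( \alpha k \frac{B}{D} \right) = \sum_{p=-\infty}^{\infty} \epsilon^{-|p|/2}\, \frac{1}{|p|!} \left( \frac{\alpha k B}{2 D} \right)^{|p|} \left[ 1 + \frac{1}{4 (|p|+1)} \left( \frac{\alpha k B}{D} \right)^2 + O(\epsilon^2) \right] \tag{5.103}$$

$$= \sum_{p=-\infty}^{\infty} \frac{G^{|p|}}{|p|!} \left[ 1 + \epsilon\, \frac{G^2}{|p|+1} \right] + O(\epsilon^2) \tag{5.104}$$

$$= 2 e^{G} - 1 + \epsilon \left( 2 G e^{G} - G^2 - 2 G \right) + O(\epsilon^2) . \tag{5.105}$$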
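
The first-order expansion of the normalization sum can be checked against a direct evaluation. The sketch below treats $G$ as a free parameter and assumes the relation $\alpha k B / D = 2 G \epsilon^{1/2}$, read off from the argument of the Bessel functions in the expressions that follow; the function names and truncation orders are choices made here for illustration.

```python
import math

def bessel_i(p, x, terms=40):
    """Modified Bessel function I_p(x) for integer p, from its power series."""
    p = abs(p)
    return sum((x / 2) ** (2 * m + p) / (math.factorial(m) * math.factorial(m + p))
               for m in range(terms))

def normalisation_sum(g, eps, p_max=40):
    """Sum over p of eps^(-|p|/2) I_p(2 g sqrt(eps)), with eps = exp(-nu)."""
    x = 2 * g * math.sqrt(eps)
    return sum(eps ** (-abs(p) / 2) * bessel_i(p, x)
               for p in range(-p_max, p_max + 1))

def first_order(g, eps):
    """Expansion to first order in eps, dropping the O(eps^2) remainder."""
    return 2 * math.exp(g) - 1 + eps * (2 * g * math.exp(g) - g * g - 2 * g)
```

Dividing $\epsilon$ by ten should reduce the discrepancy between the two functions by roughly a factor of one hundred, as expected for a remainder of order $\epsilon^2$.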
We can then write, in the equation (|5.89p for rg, to the leading order in e: 

/o (2Ge'^/2) 



2eG - 1 + e (2GeG - G2 - 2G) + 0(£2) 

1 + eG^ + 0{e^) 

2eG - 1 + e (2GeG - - 2G) + 0(£2) 



2eS 



^^o(ro)+£Fi(ro)+0(£ 



2e^^ - 1 



a/c 2 



2^ 



0(£ 



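Let me define

$$\rho_0 \equiv \lim_{\nu \to \infty} r_0 , \tag{5.111}$$

$$\rho_1 \equiv \lim_{\nu \to \infty} \frac{r_0 - \rho_0}{\epsilon} \tag{5.112}$$

so that $r_0 = \rho_0 + \epsilon \rho_1 + o(\epsilon)$. The value of $\rho_0$ is determined by the equation

$$\rho_0 = \frac{1}{2 e^{\hat{G}(\rho_0)} - 1} . \tag{5.113}$$

The value of $\rho_1$ is obtained by expanding (5.110) around $\rho_0$:

$$\rho_0 + \epsilon \rho_1 = \rho_0 + F_0'(\rho_0)\, \epsilon \rho_1 + \epsilon F_1(\rho_0) \tag{5.114}$$

which gives

$$\rho_1 = \frac{F_1(\rho_0)}{1 - F_0'(\rho_0)} . \tag{5.115}$$

In order to write the average ground state energy for large $\nu$ we also need to compute

$$\sum_{p=1}^{\infty} p\, e^{\frac{\nu}{2} p}\, I_p\!\left( \alpha k \frac{B}{D} \right) = \sum_{p=1}^{\infty} \frac{G^p}{(p-1)!} \left[ 1 + \epsilon\, \frac{G^2}{p+1} \right] + O(\epsilon^2) \tag{5.116}$$

$$= G e^{G} + \epsilon\, G \left( 1 - e^{G} + G e^{G} \right) + O(\epsilon^2) . \tag{5.117}$$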

Using these expansions in the expression for the average ground state energy (|5.97p we obtain 
after some algebra, 

GeG+eG(l-eG + GeG) 



'2eG -l + e (2GeG - G^ - 2G) 



G 



1 + ro 



e T-1 G 



1 - Po 



ak 2 
2^ 1 - Po 



1 



ak 2 
^Vo(l-po) 



£ [e-* - 1 + ^] - epo 



- ^2 _ 2S 



^2-25 



1 



2 

^ - 1 



(5.118) 

(5.119) 

1 - Po 



2 

(5.120) 



where everything except $\epsilon$ is $O(1)$ as $\nu \to \infty$. Notice that the term in $\rho_1$ does not contribute to the
first order result in the end.

The conclusion of this somewhat tedious calculation is that 



eo(j^) 



(5.121) 



Therefore, in order to obtain the equilibrium distribution of fields for formulae extracted from the
Satisfiable Ensemble $\mathcal{P}_{\mathrm{sat}}[\mathcal{F}]$, it is sufficient to take the limit $\nu \to \infty$ in (5.89), giving

$$\rho_0 = \frac{1}{2 e^{\hat{G}(\rho_0)} - 1} \tag{5.122}$$

$$\rho_p \equiv \lim_{\nu \to \infty} r_p = \frac{\hat{G}(\rho_0)^{|p|}}{|p|!}\, \frac{1}{2 e^{\hat{G}(\rho_0)} - 1} \qquad (p \neq 0) \tag{5.123}$$

where $\hat{G}(\rho_0)$ is defined by (5.101) as

$$\hat{G}(\rho_0) \equiv \frac{\alpha k}{2}\, \frac{\left( \frac{1 - \rho_0}{2} \right)^{k-1}}{1 - \left( \frac{1 - \rho_0}{2} \right)^{k}} . \tag{5.124}$$

For any $k$ and $\alpha$ it is easy to solve (5.122) to find $\rho_0$, and then use it to compute all other $\rho_p$, thus
completely defining the distribution of fields $R(h)$. In the following we shall see that this is sufficient
to characterize some very interesting properties of the solutions. We shall also return to the two
ansätze we made to derive these results: the replica symmetric form (5.29) of $c(\cdot)$, and the integer-only
form of $R(h)$ in (5.75), in Section 5.5 about the stability of the solution.
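
As an illustration of how (5.122) can be solved in practice, the sketch below iterates the equation starting from $\rho_0 = 0$. It relies on the form of $\hat{G}(\rho_0)$ as I read it from (5.124) and from the cavity argument of the next Section, which should be regarded as an assumption given the state of the source; the function names and the number of iterations are choices made here.

```python
import math

def g_hat(rho0, alpha, k):
    """(alpha k / 2) ((1 - rho0)/2)^(k-1) / (1 - ((1 - rho0)/2)^k), read from (5.124)."""
    half = (1 - rho0) / 2
    return 0.5 * alpha * k * half ** (k - 1) / (1 - half ** k)

def solve_rho0(alpha, k, iterations=200):
    """Iterate rho0 <- 1 / (2 exp(g_hat(rho0)) - 1), starting from rho0 = 0."""
    rho0 = 0.0
    for _ in range(iterations):
        rho0 = 1 / (2 * math.exp(g_hat(rho0, alpha, k)) - 1)
    return rho0

def field_weights(alpha, k, p_max=60):
    """Weights rho_p of (5.122) and (5.123) for |p| <= p_max."""
    rho0 = solve_rho0(alpha, k)
    g = g_hat(rho0, alpha, k)
    return {p: rho0 * g ** abs(p) / math.factorial(abs(p))
            for p in range(-p_max, p_max + 1)}
```

Since $\hat{G}(1) = 0$, the value $\rho_0 = 1$ always solves the equation. The right-hand side is increasing in $\rho_0$, so the iteration started at 0 increases monotonically towards the smallest solution, which is the one that goes to zero at large $\alpha$. The weights sum to 1, as required by the normalization of $R(h)$.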



5.3 Cavity formalism for the fields distribution 

The results of the previous Section can be obtained in a rather more straightforward way, at the price
of some more assumptions.

Let us consider a formula over $N - 1$ variables, and let us add a new variable, which will appear
in $\ell_+$ new clauses as a non-negated literal, and in $\ell_-$ as a negated one. For random formulas from the
Uniform Ensemble, $\ell_+$ and $\ell_-$ will be independent random variables with Poissonian distribution

$$P_L(\ell) = e^{-\alpha' k / 2}\, \frac{(\alpha' k / 2)^{\ell}}{\ell!} \tag{5.125}$$

where $\alpha'$ is some constant that we shall determine later.

Let us denote by $1 - \rho_0$ the probability that an "old" variable is constrained, i.e. that if it changes
value some existing clause will be violated. Then, the new variable will be constrained by a new clause if and only if
all the $k - 1$ other variables in the clause are constrained, and if they appear with the "wrong" sign
in the new clause. The probability for this to happen is

$$q = \left( \frac{1 - \rho_0}{2} \right)^{k-1} . \tag{5.126}$$

The number of clauses that contain the new variable $x$ or its negation $\bar{x}$ and which constrain them,
which I shall denote $m_+$ and $m_-$ respectively, will be independent random variables with distribution

$$P_M(m) = \sum_{\ell = m}^{\infty} P_L(\ell) \binom{\ell}{m} q^m (1 - q)^{\ell - m} \tag{5.127}$$

$$= e^{-\alpha' k q / 2}\, \frac{(\alpha' k q / 2)^m}{m!} . \tag{5.128}$$

I shall also introduce a weighted distribution, in which $\ell$ is the weight, for later use:

$$\hat{P}_M(m) = \sum_{\ell = m}^{\infty} \ell\, P_L(\ell) \binom{\ell}{m} q^m (1 - q)^{\ell - m} \tag{5.129}$$

$$= P_M(m) \left[ m + \frac{\alpha' k}{2} (1 - q) \right] \tag{5.130}$$

(notice that this is not normalized, since $\sum_{m=0}^{\infty} \hat{P}_M(m) = \alpha' k / 2$).

The $m_+$ clauses that constrain $x$ will be satisfied if $x$ = TRUE, while the $m_-$ clauses that constrain
$\bar{x}$ will be satisfied if $x$ = FALSE. The minimal increase in energy after the addition of $x$ to the formula
is therefore

$$\Delta E = \min\{m_+, m_-\} . \tag{5.131}$$

Let me define the "magnetic field" $h$ as the difference $m_+ - m_-$. Both $\Delta E$ and $h$ will be random
variables, with joint distribution

$$P(\Delta E, h) = \sum_{m_+ = 0}^{\infty} P_M(m_+) \sum_{m_- = 0}^{\infty} P_M(m_-)\; \delta_{\Delta E, \min\{m_+, m_-\}}\; \delta_{h, m_+ - m_-} . \tag{5.132}$$

In the spirit of Paragraph 5.2.3 I am going to weight each possible new formula with a factor
$e^{-\nu \Delta E}$. The probability that the new variable is subject to a field $h = p \in \mathbb{Z}$ is then

$$r_p = \frac{\sum_{\Delta E \geq 0} P(\Delta E, p)\, e^{-\nu \Delta E}}{\sum_{\Delta E \geq 0} \sum_{m = -\infty}^{\infty} P(\Delta E, m)\, e^{-\nu \Delta E}} . \tag{5.133}$$

In order to restrict the computation to satisfiable formulae, let us take the limit $\nu \to \infty$, so that
only formulae with $\Delta E = 0$ contribute. The probability that the new variable has zero field (i.e. that
it is not constrained) is then

$$\rho_0 = \lim_{\nu \to \infty} r_0(\nu) = \frac{P(0, 0)}{\sum_{m=-\infty}^{\infty} P(0, m)} = \frac{1}{2 e^{\alpha' k q / 2} - 1} \tag{5.134}$$

and since $q$ is a function of $\rho_0$ defined in (5.126), this expression is an equation which determines $\rho_0$.

If we had no restrictions on the clauses added to the formula, their average number would be
$\alpha' k$. However, we are restricting the ensemble to satisfiable formulae only: some of the potential new
clauses will have to be rejected, because they would make the formula UNSAT, and the average number
of clauses effectively added will be

$$\alpha' k \left[ 1 - \left( \frac{1 - \rho_0}{2} \right)^{k} \right] . \tag{5.135, 5.136}$$

In order for $\alpha$ to be the clause to variable ratio of the formula, we must impose

$$\alpha = \alpha' \left[ 1 - \left( \frac{1 - \rho_0}{2} \right)^{k} \right] \tag{5.137}$$

which determines $\alpha'$.

Multiplying on both sides by $k q / 2$ and recalling the definition of $q$ we obtain

$$\frac{\alpha' k q}{2} = \frac{\alpha k}{2}\, \frac{\left( \frac{1 - \rho_0}{2} \right)^{k-1}}{1 - \left( \frac{1 - \rho_0}{2} \right)^{k}} \tag{5.138}$$

which, compared with (5.124), gives

$$\hat{G}(\rho_0) = \frac{\alpha' k q}{2} . \tag{5.139}$$

The equation (5.134) for $\rho_0$ is then

$$\rho_0 = \frac{1}{2 e^{\hat{G}(\rho_0)} - 1} \tag{5.140}$$

which is exactly the same as (5.113).

Notice that the distribution that we have computed is the distribution of the cavity fields, i.e. the 
fields acting on the new variable and generated by the old ones. A priori this distribution is different 
from that of the real fields, which include the effect of the new clauses on the values of the old variables 
(and therefore of the fields they induce). The distribution we are interested in is the one of the real 
fields, which is what we have computed by means of the replica calculation, not the distribution of 
cavity fields. However, it can be shown that these two distributions coincide in the case when they 
are poissonian. I shall now prove that this is indeed so, at least in the limit of large $\alpha$.

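The generating function $g(x)$ of the distribution of variable occurrences $\ell_+ + \ell_-$ over satisfiable formulae, i.e. such that $\min\{m_+, m_-\} = 0$, can be computed as

$$ g(x) \propto \sum_{\ell_+, \ell_- = 0}^{\infty} P_L(\ell_+) P_L(\ell_-) \, x^{\ell_+ + \ell_-} \sum_{m_+=0}^{\ell_+} \sum_{m_-=0}^{\ell_-} \binom{\ell_+}{m_+} q^{m_+} (1-q)^{\ell_+ - m_+} \binom{\ell_-}{m_-} q^{m_-} (1-q)^{\ell_- - m_-} \, \delta_{0, \min\{m_+, m_-\}} \tag{5.141} $$

which, once normalized so that $g(1) = 1$, gives

$$ g(x) = e^{\alpha' k (x-1)(1-q)} \, \frac{2 e^{\alpha' k x q/2} - 1}{2 e^{\alpha' k q/2} - 1} . \tag{5.142} $$

For $\alpha \to \infty$ we see from (5.134) that $\rho_0 \to 0$ and from (5.137) that $\alpha' = O(\alpha)$, so that

$$ g(x) \simeq e^{\alpha' k (x-1)(1-q/2)} + e^{-O(\alpha)} \simeq e^{\alpha k (x-1)} \tag{5.143} $$

which is the generating function of a poissonian distribution of parameter $\alpha k$.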
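
As a sanity check of the generating function, the following sketch compares a direct (truncated) summation over the occurrences with the closed form $e^{\alpha' k (x-1)(1-q)} (2e^{\alpha' k x q/2} - 1)/(2e^{\alpha' k q/2} - 1)$. It assumes poissonian occurrences of mean $\alpha' k/2$ and independent messages of probability $q$; the names `g_closed`, `g_direct`, the cutoff `l_max` and the numerical values are hypothetical.

```python
import math


def g_closed(x, alpha_p, k, q):
    """Closed form of the generating function on satisfiable formulae."""
    lam = alpha_p * k / 2.0
    return (math.exp(2.0 * lam * (x - 1.0) * (1.0 - q))
            * (2.0 * math.exp(lam * x * q) - 1.0)
            / (2.0 * math.exp(lam * q) - 1.0))


def g_direct(x, alpha_p, k, q, l_max=80):
    """Truncated sum over (l_plus, l_minus), keeping min(m_plus, m_minus) = 0."""
    lam = alpha_p * k / 2.0
    pois = [math.exp(-lam) * lam**l / math.factorial(l) for l in range(l_max)]
    num = 0.0
    den = 0.0
    for l_plus in range(l_max):
        for l_minus in range(l_max):
            # P(m_plus = 0 or m_minus = 0 | l_plus, l_minus), by inclusion-exclusion
            no_contradiction = ((1.0 - q)**l_plus + (1.0 - q)**l_minus
                                - (1.0 - q)**(l_plus + l_minus))
            w = pois[l_plus] * pois[l_minus] * no_contradiction
            num += w * x**(l_plus + l_minus)
            den += w
    return num / den


print(g_direct(0.7, 4.0, 3, 0.25), g_closed(0.7, 4.0, 3, 0.25))

# Large alpha': compare with a Poisson generating function of parameter alpha * k,
# where alpha = alpha' * (1 - q / 2).
alpha_p, k, q, x = 200.0, 3, 0.25, 0.9
print(g_closed(x, alpha_p, k, q), math.exp(alpha_p * (1.0 - q / 2.0) * k * (x - 1.0)))
```

The second comparison illustrates the poissonian limit: the two printed numbers should differ only by exponentially small relative corrections.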

The conclusion of this Section is that the interpretation of the field $h$ as the number of clauses that are violated if a variable is flipped is correct, and the distribution of fields $R(h)$ is the distribution of their values over the variables and the formulae from $\mathcal{P}_{\rm sat}$.

5.4 Comparison of $\mathcal{P}_{\rm sat}$ and $\mathcal{P}_{\rm plant}$ at large $\alpha$

I am now going to use the distribution of fields computed in Section 5.2 to show that, for $\alpha \to \infty$, the statistical properties of formulae extracted from $\mathcal{P}_{\rm sat}$ coincide with those of formulae from $\mathcal{P}_{\rm plant}$.
For $\alpha \to \infty$ the solution to (5.122), (5.123) and (5.124) is

^0 - (5.144) 

7 ^ ^{0)^^^. (5.146) 

Since $\gamma = O(\alpha)$, this means that the fraction of variables that are not constrained is $\rho_0 \sim e^{-O(\alpha)}$: the solutions to a satisfiable formula at large $\alpha$ are all very similar to each other. Moreover, the average value of the fields is $O(\gamma) = O(\alpha)$, so the constrained variables have strong fields that force them to the correct assignment.

5.4.1 Distribution of fields

I shall now compute the distribution of fields for formulae extracted from the Planted Ensemble $\mathcal{P}_{\rm plant}$.
Let us consider a configuration $X$, and a random clause $C$ satisfied by $X$. If one variable $x_i$ is flipped, what is the probability $q$ that $C$ is no longer satisfied? It is the product of the probability that $C$ contains $x_i$, which is $k/N$, times the probability that all the other literals in the clause have been chosen with the wrong sign, which is $1/(2^k - 1)$:

$$ q = \frac{k}{N (2^k - 1)} . \tag{5.147} $$

The number $p$ of such clauses will be a random variable, with a binomial distribution $P(p)$ of parameter $q$:

$$ P(p) = \binom{M}{p} q^p (1-q)^{M-p} . \tag{5.148} $$

For $N \to \infty$ this reduces to a poissonian of parameter $\alpha k/(2^k - 1)$, which is $\gamma$ defined in (5.146):

$$ P(p) \xrightarrow[N \to \infty]{} \frac{\gamma^p}{p!} \, e^{-\gamma} . \tag{5.149} $$

In a random configuration $X$, half the variables will be true, giving rise to positive fields, and half will be false, giving negative fields. The distribution of fields, i.e. of the number of satisfied clauses that are violated if a variable is flipped, with the plus sign if that variable is true and minus otherwise, is

$$ \rho_p^{\rm plant} = \delta_{p,0} \, e^{-\gamma} + (1 - \delta_{p,0}) \, \frac{1}{2} \frac{\gamma^{|p|}}{|p|!} \, e^{-\gamma} . \tag{5.150} $$

Comparing with (5.145) we see that the two distributions of fields corresponding to the Satisfiable Ensemble at large $\alpha$ and to the Planted Ensemble differ by terms $e^{-O(\alpha)}$.
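
The convergence of the binomial distribution to a poissonian of parameter $\alpha k/(2^k - 1)$, and the normalization of the resulting distribution of fields, can be illustrated with a few lines of code. This is only a sketch: the function names and the values of `alpha`, `k` and `n_vars` are hypothetical, and the cutoff `p_max` is an arbitrary truncation.

```python
import math


def planted_gamma(alpha, k):
    """Poisson parameter alpha * k / (2^k - 1) of the planted fields."""
    return alpha * k / (2**k - 1)


def binomial_pmf(p, n_clauses, q):
    """Probability that exactly p of n_clauses clauses are violated by the flip."""
    return math.comb(n_clauses, p) * q**p * (1.0 - q)**(n_clauses - p)


def planted_field_distribution(alpha, k, p_max=40):
    """Weights rho_p of the planted ensemble for |p| <= p_max."""
    gamma = planted_gamma(alpha, k)
    rho = {0: math.exp(-gamma)}
    for p in range(1, p_max + 1):
        # Half of the variables are true (p > 0), half are false (p < 0).
        half_poisson = 0.5 * math.exp(-gamma) * gamma**p / math.factorial(p)
        rho[p] = half_poisson
        rho[-p] = half_poisson
    return rho


alpha, k, n_vars = 10.0, 3, 100_000
n_clauses = int(alpha * n_vars)
q = k / (n_vars * (2**k - 1))
gamma = planted_gamma(alpha, k)
for p in (0, 4, 8):
    poisson = math.exp(-gamma) * gamma**p / math.factorial(p)
    print(p, binomial_pmf(p, n_clauses, q), poisson)
```

At this system size the binomial and poissonian probabilities should already agree to several digits.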
5.4.2 Correlation between field and number of occurrences



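Not only is the typical magnitude of the fields in formulae from $\mathcal{P}_{\rm sat}$ of order $\alpha$ at large $\alpha$, but it is correlated to a bias in the distribution of the relative number of occurrences of variables and their negations, as I shall prove with the following computation.

In order for a formula to be satisfiable, there must be no variable that receives contradictory messages, i.e. which is constrained by some clauses to be true and by some others to be false. If we assume that the field on the variable is $h > 0$, this means that the number $m_-$ of clauses that constrain it to be false must be 0, while the number $m_+$ of clauses that constrain it to be true will be positive or null.

Let us denote by $\langle \ell_+ \rangle_{h>0}$ the average number of occurrences of such a variable in clauses where it is not negated, and by $\langle \ell_- \rangle_{h>0}$ the corresponding number for its negation. These will be random variables whose distribution can be expressed in terms of (5.128) and (5.130) as

$$ \langle \ell_+ \rangle_{h>0} = \frac{\sum_{m_+ \ge 1} P'_M(m_+) \, P_M(0)}{\sum_{m_+ \ge 1} P_M(m_+) \, P_M(0)} \tag{5.151} $$

where in the numerator $P_M(0)$ is the probability that the number of clauses sending a negative message to the variable is 0, $P'_M(m_+)$ is proportional to the average number of occurrences of the variable conditioned on the message it receives being positive, and the denominator is a normalization.
Using the explicit distributions (5.128) and (5.130) we have

$$ \langle \ell_+ \rangle_{h>0} = \frac{\sum_{m_+ \ge 1} \frac{(\alpha' k q/2)^{m_+}}{m_+!} \left[ m_+ + \frac{\alpha' k}{2}(1-q) \right] e^{-\alpha' k q/2} \times e^{-\alpha' k q/2}}{\sum_{m_+ \ge 1} \frac{(\alpha' k q/2)^{m_+}}{m_+!} \, e^{-\alpha' k q/2} \times e^{-\alpha' k q/2}} \tag{5.152} $$

$$ \phantom{\langle \ell_+ \rangle_{h>0}} = \frac{\alpha' k}{2} \, \frac{1 - (1-q) \, e^{-\alpha' k q/2}}{1 - e^{-\alpha' k q/2}} \tag{5.153} $$

$$ \phantom{\langle \ell_+ \rangle_{h>0}} = \frac{\alpha k}{2} \, \frac{1}{1 - 2^{-k}} + e^{-O(\alpha)} \tag{5.154} $$

$$ \langle \ell_- \rangle_{h>0} = \frac{\sum_{m_+ \ge 1} \frac{(\alpha' k q/2)^{m_+}}{m_+!} \, e^{-\alpha' k q/2} \times \frac{\alpha' k}{2}(1-q) \, e^{-\alpha' k q/2}}{\sum_{m_+ \ge 1} \frac{(\alpha' k q/2)^{m_+}}{m_+!} \, e^{-\alpha' k q/2} \times e^{-\alpha' k q/2}} \tag{5.155} $$

$$ \phantom{\langle \ell_- \rangle_{h>0}} = \frac{\alpha' k}{2} (1-q) \tag{5.156} $$

$$ \phantom{\langle \ell_- \rangle_{h>0}} = \frac{\alpha k}{2} \, \frac{1 - 2^{1-k}}{1 - 2^{-k}} + e^{-O(\alpha)} \tag{5.157} $$

from which we obtain the average value of the bias

$$ \frac{\langle \ell_+ \rangle_{h>0} - \langle \ell_- \rangle_{h>0}}{\langle \ell_+ \rangle_{h>0} + \langle \ell_- \rangle_{h>0}} = \frac{1}{2^k - 1} + e^{-O(\alpha)} . \tag{5.158} $$

Therefore variables with positive field appear more frequently non-negated than negated. Of course, the opposite is true for variables with negative field.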
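
The value $1/(2^k - 1)$ of the bias can be checked by direct summation. In the sketch below the occurrences $\ell_\pm$ are taken to be poissonian of mean $\alpha' k/2$ and each clause sends a message with probability $q$; setting $q = 2^{1-k}$ (its large-$\alpha$ value) is an assumption of the example, and the function name and the cutoff `l_max` are hypothetical.

```python
import math


def conditional_occurrences(alpha_p, k, q, l_max=150):
    """Average (l_plus, l_minus) for a variable with m_plus >= 1 and m_minus = 0."""
    lam = alpha_p * k / 2.0
    pois = [math.exp(-lam) * lam**l / math.factorial(l) for l in range(l_max)]
    norm = 0.0
    sum_plus = 0.0
    sum_minus = 0.0
    for l_plus in range(l_max):
        p_some_message = 1.0 - (1.0 - q)**l_plus     # P(m_plus >= 1 | l_plus)
        for l_minus in range(l_max):
            p_no_message = (1.0 - q)**l_minus        # P(m_minus = 0 | l_minus)
            w = pois[l_plus] * pois[l_minus] * p_some_message * p_no_message
            norm += w
            sum_plus += w * l_plus
            sum_minus += w * l_minus
    return sum_plus / norm, sum_minus / norm


k = 3
l_plus_avg, l_minus_avg = conditional_occurrences(alpha_p=20.0, k=k, q=2.0**(1 - k))
bias = (l_plus_avg - l_minus_avg) / (l_plus_avg + l_minus_avg)
print(bias, 1.0 / (2**k - 1))
```

The residual difference between the two printed numbers is the exponentially small correction, which shrinks as `alpha_p` grows.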

The same computation can be easily performed for formulae from the Planted Ensemble. Given a configuration $X$ and $k$ indices of variables composing a clause, out of the $2^k$ possible choices of the negations of the corresponding literals only $2^k - 1$ will give satisfied clauses. If a variable $x$ is true in $X$, then the number of satisfied clauses in which it appears non-negated is $2^{k-1}$, corresponding to the random choices of the signs of the other literals; the number of clauses in which it appears negated, however, will be smaller, as at least one of the other literals must have the proper sign to satisfy the clause, giving $2^{k-1} - 1$ possible choices.
Since the average numbers of occurrences of $x$ and $\bar{x}$ are proportional to these counts, we shall have

$$ \frac{\langle \ell_+ \rangle_{\rm plant} - \langle \ell_- \rangle_{\rm plant}}{\langle \ell_+ \rangle_{\rm plant} + \langle \ell_- \rangle_{\rm plant}} = \frac{2^{k-1} - (2^{k-1} - 1)}{2^{k-1} + (2^{k-1} - 1)} \tag{5.159} $$

$$ \phantom{\frac{\langle \ell_+ \rangle_{\rm plant} - \langle \ell_- \rangle_{\rm plant}}{\langle \ell_+ \rangle_{\rm plant} + \langle \ell_- \rangle_{\rm plant}}} = \frac{1}{2^k - 1} . \tag{5.160} $$

Comparing with (5.158), we see that the distribution of the bias in the Planted Ensemble is the same as in the Satisfiable Ensemble at large $\alpha$, up to terms $e^{-O(\alpha)}$.
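
The counting argument for the Planted Ensemble can be verified by brute-force enumeration of the sign patterns of a single clause. In this sketch all variables are taken to be true in $X$ (an assumption made without loss of generality, by symmetry), and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def planted_sign_counts(k):
    """Count satisfied sign patterns of a k-clause by the sign of variable 0.

    All variables are assumed true in the planted configuration, so a literal
    is satisfied exactly when it is not negated.
    """
    non_negated = 0
    negated = 0
    for negations in product([False, True], repeat=k):
        if all(negations):
            continue                      # every literal is false: clause violated
        if negations[0]:
            negated += 1
        else:
            non_negated += 1
    return non_negated, negated


for k in (3, 4, 5):
    plus, minus = planted_sign_counts(k)
    print(k, plus, minus, Fraction(plus - minus, plus + minus))
```

For each $k$ the two counts are $2^{k-1}$ and $2^{k-1} - 1$, and the printed ratio is $1/(2^k - 1)$.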

5.4.3 Finite energy results 

The results of the two previous paragraphs extend to formulae with small positive energy, i.e. which are not satisfiable.

The average value of the ground state energy, given by (5.120), greatly simplifies for large $\alpha$, giving

eo(^) = J[l + 0(7'e-^)] e-'^ (5.161) 

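with $\gamma = O(\alpha)$ defined in (5.146).

The computation of the bias (5.151) can be generalized to finite large values of $\nu$ by including positive values of $m_-$, weighted with a factor $e^{-\nu m_-}$. To first order in $e^{-\nu}$ only $m_- = 1$ contributes and we have

$$ \langle \ell_+ \rangle_{h>0} = \frac{\sum_{m_+ \ge 1} P'_M(m_+) \sum_{0 \le m_- < m_+} P_M(m_-) \, e^{-\nu m_-}}{\sum_{m_+ \ge 1} P_M(m_+) \sum_{0 \le m_- < m_+} P_M(m_-) \, e^{-\nu m_-}} \simeq \frac{\sum_{m_+ \ge 1} P'_M(m_+) P_M(0) + \sum_{m_+ \ge 2} P'_M(m_+) P_M(1) \, e^{-\nu}}{\sum_{m_+ \ge 1} P_M(m_+) P_M(0) + \sum_{m_+ \ge 2} P_M(m_+) P_M(1) \, e^{-\nu}} . \tag{5.162} $$

Computing the sums as for (5.151), we obtain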



- + Oie-'n- (5.163) 



(^+)h>o + (^-)h>o 2'-^ 2(2^-1)^ 
where we can use (|5.16ip to eliminate e^" and obtain 



1 - eok2 



1 I 2 



2 2^+1 - 2 



+ o(eo). (5.165) 



We see that as long as eo <C 2 /k the bias remains of the same order as for satisfiable formulae. 
5.4.4 Algorithmic implications 

In this section, I have shown that the distribution of fields $\rho_p$ and the average bias obtained for formulae extracted from the Planted Ensemble coincide with those for formulae extracted from the Satisfiable Ensemble for large enough $\alpha$, and that this extends to finite energy formulae from the
Uniform Ensemble, provided the energy is eo <C /k. 

The demonstration in [75] of the convergence of WP is based on the following facts, which are proved for the Planted Ensemble:

• At large $\alpha$, typical formulae have a large core, i.e. a set of variables that take the same value in all the solutions to the formula. The fraction of variables that are not in the core is $e^{-O(\alpha)}$.

• The cavity fields corresponding to core variables and computed for satisfying assignments are of order $O(\alpha)$.

• Even for random assignments, the cavity fields are of order $O(\alpha)$. This is due to the fact that the value of core variables in satisfying assignments is correlated to a bias in the relative number of occurrences of the variable and its negation in the formula.

As we have seen in this Section, each of these properties holds as well for formulae drawn from the Satisfiable Ensemble $\mathcal{P}_{\rm sat}$, provided $\alpha$ is large enough. This supports the conclusion that the convergence of WP should extend to $\mathcal{P}_{\rm sat}$. I therefore claim that Hypothesis 1p, formulated at the end of Paragraph 5.1.4, is refuted by WP for any $p > 0$.

Moreover, a probabilistic version of Hypothesis 2 states that 

Hypothesis 2p For every fixed $\epsilon > 0$, even when $\alpha$ is an arbitrarily large constant (independent of $N$), there is no polynomial time algorithm that on most Random-3-SAT formulae outputs typical and outputs not typical with probability $p$ on formulae with $(1 - \epsilon)M$ satisfiable clauses.

The finite energy results of Paragraph 5.4.3 support the claim that Hypothesis 2p is refuted by WP
for any p > provided e ^ /k. 



5.5 Stability of the RS free energy 

The conclusions of the previous sections are based on two ansätze: that the order parameter $c(\cdot)$ has the replica symmetric form (5.29), and that the distribution of fields $R(h)$ is non zero only for integer values of the fields, as in (5.75).

In this Section, I shall support the claim that these two ansätze are correct. In order to do this, I shall prove that more general solutions for the saddle point equations that determine $R(h)$, which are non zero for fractional values of $h$, reduce to the ansatz, i.e. that the non-integer contributions vanish. Then I shall prove that the eigenvalues of the stability matrix of the saddle point equations computed for the replica symmetric form of $c(\cdot)$ are all negative for large enough $\alpha$ and $\nu \to \infty$. This does not prove that the ansatz corresponds to a global maximum, but only to a local one. In order to rule out the existence of other solutions to the saddle point equations, I shall prove that two real replicas of the formula necessarily have the same distribution of fields, and therefore must be in the same thermodynamic state, which is therefore unique.



5.5.1 Solutions with non-integer fields 

Instead of the integer valued ansatz of (5.75), let us assume that $R(h)$ takes the more general form

$$ R(h) = \sum_{p=-\infty}^{\infty} r_p \, \delta\!\left( h - \frac{p}{q} \right) \tag{5.166} $$

where $q$ is an integer larger than 1. Substituting this assumption in the saddle point equations (5.74) gives the following functional equation




(5.167)
which must be true for any $x$, where $\mu$ is a constant, and where

1 — w 
'"j 



A, ^ ^ 7 . (5.168) 



(w - Tj-i)'^ ^ - (it; - rj)*-' ^ 
1 — w*-' 



A, = ^ ^ '-^ (l<.7<g), (5.169) 



{w - rp _i)'=~^ 

1 - 
1-ro 



^ -P-^ , (5.170) 

1 — w"^ 



(5.171) 



2 

The value of /i can be determmed by taking x = iv and then sending v — > oo, which gives 

By taking instead x = and sending v oo one also obtains that 

ro = . (5.173) 
Combining these two identities, we obtain an equation for tq: 

ro = r-^—. ■ (5.174) 



2 exp 



ak 1 



2 



1 



Notice that this is exactly the same equation, (5.122) and (5.124), that we have obtained with the ansatz of integer fields (5.75).

For $j = 1$ we have from (5.167):

ri=ro^ (5.175) 



= - fe ■ (5.176) 



2 

which can be written as 

To w^~^ — {w — ri)*"'"^ 
y 1 - 

Notice that $r_1 = 0$ is a solution of this equation. The derivative with respect to $r_1$ of the right hand side is

r^{k-l){;w-r^ _ (5.177) 
2 1 - w'^ ^ ' 

When $\alpha$ is large, $r_0 = e^{-O(\alpha)}$ and $w = 1/2 - e^{-O(\alpha)}$. The possible range of values of $r_1$ goes from 0 to $w$ (which is the probability of the field being positive, and therefore must be larger than or equal to $r_1$). For large enough $\alpha$ this derivative is much smaller than 1 for any of the possible values of $r_1$, and therefore there cannot be another solution to (5.176).

A similar argument can be constructed for any of the coefficients $r_p$ corresponding to fractional values of the field, showing that only integer values are admissible among rationals. Of course, this doesn't prove that other distributions $R(h)$ satisfying the saddle point equations and involving irrational fields cannot exist, but it is a rather strong indication that the ansatz (5.75) is correct.



5.5.2 Eigenvalues of the stability matrix 

The stability matrix of the free energy (5.24) is defined as its second derivative with respect to the order parameter $c(\cdot)$,
which gives:



M^? = - 



c(a) 



53,T - 



\k{k - 1) Ega-.-fffc ■ ■ ■ c(iTfc) g((7, f, g3, . . . , Bk) 

Effi...ff, c(ai)---c(CTfe)£:(CTl,...,CTfc) 



+ 



>^fc^ Eff^-ff. ^(0^2) • • • c(CTfc) CT2, . . . , CTfc) Eff' ...ff; c((T^) • • • c(4) £:(f, 4, . . . , 4; 



[Effi...ff, c(cti) ■ • ■ c(CTfc) £■((?!, . . . , (Tfc) 



(5.179) 



The solution $c(\cdot)$ of the saddle point equations, given by equations (5.29), (5.75) and (5.123), can be written as



c(a) 



1 



2e- 



-{exp 











+ exp 





1 



where 



and $\gamma$ is defined in (5.124). In the limit $\nu \to \infty$ this reduces to
For large $\alpha$ this further simplifies, as $\gamma = O(\alpha)$ so that

$$ c(\vec{\sigma}) = \frac{1}{2} \, \delta_{|s|,1} + e^{-O(\alpha)} $$



(5.180) 
(5.181) 

(5.182) 

(5.183) 



We can now compute the sums that appear in the expression of $M$. In order to do this, let me recall the definition of the effective interaction $\mathcal{E}$ from (5.19):



c(cti) • ■ •c((Tfc)f ((Ti, . . . ,(7fc) 



{-1,1} 



n k 



E c(?i)---c(a.)^ expi-/3^n<5(a;,g,)l . 



(5.184) 
(5.185) 



a=lj = l 



In the limit (3 00, only the terms where the exponent vanish contribute. The value of £ is then 
times the number of fc-componcnt vectors v_ such that for any j ~ 1, . . . , A: and any a ~ 1, . . . , n 
we have Vj ^ cr°. Since the only a that have a non- vanishing c((7) are {ctq = 1 (Va = 1, . . . ,n)} 
and {(Tq = — 1 (Va = 1, . . . ,n)}, these n conditions arc actually identical, and only one (out of the 
possible 2^^) vector y_ is excluded. 

The sum over the k vectors aj therefore has 2*^ terms (corresponding to the 2 possible values of 
(Tj), each of which has a factor 2^^ from the product of the c(-)'s, and a factor 2^^ x (2*^ — 1) from 
the £, so that 

In a very similar way, 



S^k-l{B) = ^ c(CT2)---c(CTfc)f(CT,CT2,---,CTfc) 
1 1 



2fc-i 2*= 
1 '^kl,i 



[2'=-(2-'5n,i)] 



(5.187) 

(5.188) 
(5.189)
since if $|s| = 1$ all the columns in the matrix $\vec{\sigma}$ will be equal (and only one vector $\underline{v}$ will be excluded), while if $|s| < 1$ there will be two column values (and correspondingly 2 vectors excluded).
Finally,



J24-2(o',f) 



^ c(ct3) • • • c{ak)£{a, f, CTs, . . . , ak) 



-,k-2 



1 1 

2k 



[2'-A{a,r)] 



(5.190) 

(5.191) 
(5.192) 



where $A(\vec{\sigma}, \vec{\tau})$ counts the number of different pairs, among the possible four which are $(1, 1)$, $(1, -1)$, $(-1, 1)$, $(-1, -1)$, that actually occur in the set $\{(\sigma^a, \tau^a) \,|\, a = 1, \dots, n\}$.
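
The counting function $A$ is simple enough to be written down explicitly. The following sketch represents the two replicated spin configurations as tuples of $\pm 1$; the function name `count_pairs` is hypothetical.

```python
def count_pairs(sigma, tau):
    """Number of distinct pairs (sigma^a, tau^a) occurring over the replicas a."""
    if len(sigma) != len(tau):
        raise ValueError("sigma and tau must have the same number of replicas")
    return len(set(zip(sigma, tau)))


print(count_pairs((1, 1, 1), (1, 1, 1)))        # a single pair, (1, 1)
print(count_pairs((1, 1, -1), (1, -1, -1)))     # pairs (1, 1), (1, -1), (-1, -1)
print(count_pairs((1, -1, 1, -1), (1, 1, -1, -1)))
```

The value is 1 only when both configurations are replica symmetric, and it can reach 4 as soon as the number of replicas is at least 4.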

We can then substitute these expressions in (5.179) to obtain, up to terms of order $e^{-O(\alpha)}$,



2e^ 5a. f 



ak{k-l) [2*= -^(CT,f) 



e^(5|,|,i + (l-(5|,|^i) 2^--l 
ak'' [2^ - 2 + [2^- - 2 + <5|t|a] 

{2'^ - 1)2 



+ 



(5.193) 
(5.194) 



where $t$ is defined for $\vec{\tau}$ as $s$ for $\vec{\sigma}$. This matrix is invariant under the exchange of replica indices, and therefore it can be block-diagonalized in subspaces of well-defined replica symmetry.
In order to take into account the normalization constraint

$$ \sum_{\vec{\sigma}} c(\vec{\sigma}) = 1 \tag{5.195} $$

it is convenient to decompose the dependency of $\mathcal{F}[c(\cdot)]$ in two, writing



^[c(ct)] = JT' 



1 - ^c'{a),c{a) 



(5.196) 



with $c'(\vec{\sigma}) = c(\vec{\sigma})$ for every $\vec{\sigma}$ except $\vec{\sigma} = \vec{1} = (1, \dots, 1)$, and 0 otherwise, and where $\mathcal{F}'$ is the functional defined by the previous identity. The stability matrix of $\mathcal{F}'$ is then



(7T 



-2] 



ak{k — 1) 
2^ - 1 



I + A{a,f) - A{l,f) - A{a,l) 



(2*^ - 1)2 



In non-symmetric subspaces, $|s| \neq 1 \neq |t|$ and the previous equation becomes



-2e^ fcf-2 



ak{k - 1) 
2'= - 1 



1 + A{a, f) - A(l, f) - A{a, 1) 



Aak' 



If 



(5.197) 



(5.198) 



(5.199) 



The diagonal terms of this matrix are of order $O(e^{\nu})$, while the off-diagonal terms are of order $O(\alpha)$. The contribution of the off-diagonal terms to the eigenvalues will be given by $2^n$ terms, each of $O(\alpha)$. Since the contribution of the diagonal terms to the eigenvalues is of order $O(e^{\nu})$ and it is negative, this ensures that for large enough $\alpha$ all the eigenvalues will be negative. In these subspaces, the replica symmetric solution is therefore a local maximum.
In the symmetric subspace, all the diagonal elements of $M'_{\vec{\sigma} \vec{\tau}}$ are of order $O(e^{\nu})$, except the term corresponding to $\vec{\sigma} = \vec{\tau} = -\vec{1}$: for this term, the exponential contributions vanish. However, we can
then write

M^- = ~2 Sgr + ^3r (5.200) 
and treat $V$ as a perturbation. For $\vec{\sigma} = \vec{\tau} = -\vec{1}$ the matrix element of $V$ is

v_ j_ _j = M'_ _ J = M_r^ _r - M_ j_ J - Mr_ _j + m^^ ^ (5.201) 

and from (5.179) this is equal to $-4$, so it is negative.

The conclusion of this analysis is that, for $\alpha$ large enough, all the eigenvalues of the stability matrix of $\mathcal{F}$ computed for the $c(\cdot)$ which satisfies the replica symmetric saddle point equations are negative, and therefore that this solution is locally stable.



5.5.3 Uniqueness of the solution 

The conclusion from the previous Paragraph cannot rule out the existence of other solutions to the saddle point equations, which could possibly be the true global maximum of $\mathcal{F}$. I shall now provide an argument supporting that the saddle point equations have a unique solution, which is therefore the one found in Section 5.2.

Let us consider two real replicas of the system, i.e. two identical satisfiable formulae. I shall indicate by $\alpha$ the thermodynamic state of the first replica, and by $\beta$ that of the second one (the context will make it obvious when $\alpha$ refers to the clause to variable ratio of the formula). I want to study, with the cavity method, the joint probability for a variable of having a positive, negative or null field in the two states $\alpha$ and $\beta$, which I shall denote by the following quantities:

$$ \begin{array}{ccc} p^{\alpha\beta}_{++} & p^{\alpha\beta}_{+0} & p^{\alpha\beta}_{+-} \\ p^{\alpha\beta}_{0+} & p^{\alpha\beta}_{00} & p^{\alpha\beta}_{0-} \\ p^{\alpha\beta}_{-+} & p^{\alpha\beta}_{-0} & p^{\alpha\beta}_{--} \end{array} \tag{5.202} $$

What I want to prove is that for large $\alpha$:

• The off-diagonal terms become negligible, so that the fields are equal in the two states for most variables;

• The term $p^{\alpha\beta}_{00}$ is much smaller than $p^{\alpha\beta}_{++}$ and $p^{\alpha\beta}_{--}$.

The consequence of these two properties will be that most variables will be constrained to take the same values in the two states $\alpha$ and $\beta$, which are therefore a single, unique, thermodynamic state.



Distribution of the number of messages 

Let us assume that a new variable is added to the formula, appearing non-negated in $\ell_+$ clauses and negated in $\ell_-$ clauses. These will be two independent random variables with identical poissonian distribution of parameter $\alpha' k/2$, where $\alpha'$ is some constant which will be determined later.

A clause will send a message to a variable (that is, it will constrain it) if all the other variables in the clause are constrained (that is, have a non-zero field) and appear with the "wrong" sign in the clause, which happens with probabilities

$$ q^{\alpha} = \left[ \frac{1 - (p^{\alpha\beta}_{0+} + p^{\alpha\beta}_{00} + p^{\alpha\beta}_{0-})}{2} \right]^{k-1} \tag{5.203} $$

$$ q^{\beta} = \left[ \frac{1 - (p^{\alpha\beta}_{+0} + p^{\alpha\beta}_{00} + p^{\alpha\beta}_{-0})}{2} \right]^{k-1} \tag{5.204} $$

respectively in the states $\alpha$ and $\beta$.

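For a given $\ell_+$ the probability that in the state $\alpha$ the number of clauses sending a message to the new variable is $m_+^{\alpha}$ is equal to

$$ \binom{\ell_+}{m_+^{\alpha}} (q^{\alpha})^{m_+^{\alpha}} (1 - q^{\alpha})^{\ell_+ - m_+^{\alpha}} \tag{5.205} $$

and identical distributions are valid for $m_-^{\alpha}$ for fixed $\ell_-$ and for the corresponding quantities in the state $\beta$.

The number of occurrences $\ell_+$ must be the same in the two states (the replicas are identical), and must be larger than or equal to $m_+^{\alpha}$ and $m_+^{\beta}$. The joint distribution of $m_+^{\alpha}$ and $m_+^{\beta}$ is obtained by summing over the allowed values of $\ell_+$:

$$ P_M^{\alpha\beta}(m_+^{\alpha}, m_+^{\beta}) = \sum_{\ell_+ = \max\{m_+^{\alpha}, m_+^{\beta}\}}^{\infty} e^{-\alpha' k/2} \frac{(\alpha' k/2)^{\ell_+}}{\ell_+!} \times \binom{\ell_+}{m_+^{\alpha}} (q^{\alpha})^{m_+^{\alpha}} (1 - q^{\alpha})^{\ell_+ - m_+^{\alpha}} \binom{\ell_+}{m_+^{\beta}} (q^{\beta})^{m_+^{\beta}} (1 - q^{\beta})^{\ell_+ - m_+^{\beta}} \tag{5.206} $$

and similarly for the negative messages.

The joint probability of all messages is given by the product of the distributions of positive and negative messages, since they are independent:

$$ P(m_+^{\alpha}, m_+^{\beta}, m_-^{\alpha}, m_-^{\beta}) = P_M^{\alpha\beta}(m_+^{\alpha}, m_+^{\beta}) \, P_M^{\alpha\beta}(m_-^{\alpha}, m_-^{\beta}) . \tag{5.207} $$

The values of $\{p^{\alpha\beta}_{++}, \dots, p^{\alpha\beta}_{--}\}$ are obtained from this distribution by summing over the appropriate ranges the values of $m_\pm$.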



Selection of satisfiable formulae 

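In order to have a satisfiable formula, no variable must receive contradictory messages. This means that the ranges to be considered in the sums to compute $\{p^{\alpha\beta}_{++}, \dots, p^{\alpha\beta}_{--}\}$ must be the following:

$$ p^{\alpha\beta}_{00} : \; P_M^{\alpha\beta}(0, 0) \times P_M^{\alpha\beta}(0, 0) \tag{5.208} $$
$$ p^{\alpha\beta}_{++} : \; P_M^{\alpha\beta}(m_+^{\alpha}, m_+^{\beta}) \times P_M^{\alpha\beta}(0, 0) \tag{5.209} $$
$$ p^{\alpha\beta}_{--} : \; P_M^{\alpha\beta}(0, 0) \times P_M^{\alpha\beta}(m_-^{\alpha}, m_-^{\beta}) \tag{5.210} $$
$$ p^{\alpha\beta}_{+-} : \; P_M^{\alpha\beta}(m_+^{\alpha}, 0) \times P_M^{\alpha\beta}(0, m_-^{\beta}) \tag{5.211} $$
$$ p^{\alpha\beta}_{-+} : \; P_M^{\alpha\beta}(0, m_+^{\beta}) \times P_M^{\alpha\beta}(m_-^{\alpha}, 0) \tag{5.212} $$
$$ p^{\alpha\beta}_{0+} : \; P_M^{\alpha\beta}(0, m_+^{\beta}) \times P_M^{\alpha\beta}(0, 0) \tag{5.213} $$
$$ p^{\alpha\beta}_{0-} : \; P_M^{\alpha\beta}(0, 0) \times P_M^{\alpha\beta}(0, m_-^{\beta}) \tag{5.214} $$
$$ p^{\alpha\beta}_{+0} : \; P_M^{\alpha\beta}(m_+^{\alpha}, 0) \times P_M^{\alpha\beta}(0, 0) \tag{5.215} $$
$$ p^{\alpha\beta}_{-0} : \; P_M^{\alpha\beta}(0, 0) \times P_M^{\alpha\beta}(m_-^{\alpha}, 0) \tag{5.216} $$

where in each product the first factor refers to the positive messages and the second to the negative ones, and where all the $m_\pm$ are positive, and must be summed between 1 and infinity.
I therefore define: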



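$$ S_0 = P_M^{\alpha\beta}(0, 0)^2 , \tag{5.217} $$

$$ S_1 = \sum_{m_+^{\alpha}, m_+^{\beta} = 1}^{\infty} P_M^{\alpha\beta}(m_+^{\alpha}, m_+^{\beta}) \, P_M^{\alpha\beta}(0, 0) = \sum_{m_-^{\alpha}, m_-^{\beta} = 1}^{\infty} P_M^{\alpha\beta}(0, 0) \, P_M^{\alpha\beta}(m_-^{\alpha}, m_-^{\beta}) , \tag{5.218} $$

$$ S_2 = \sum_{m_+^{\alpha} = 1}^{\infty} \sum_{m_-^{\beta} = 1}^{\infty} P_M^{\alpha\beta}(m_+^{\alpha}, 0) \, P_M^{\alpha\beta}(0, m_-^{\beta}) = \sum_{m_+^{\beta} = 1}^{\infty} \sum_{m_-^{\alpha} = 1}^{\infty} P_M^{\alpha\beta}(0, m_+^{\beta}) \, P_M^{\alpha\beta}(m_-^{\alpha}, 0) , \tag{5.219} $$

$$ S_3 = \sum_{m_+^{\beta} = 1}^{\infty} P_M^{\alpha\beta}(0, m_+^{\beta}) \, P_M^{\alpha\beta}(0, 0) = \sum_{m_-^{\beta} = 1}^{\infty} P_M^{\alpha\beta}(0, 0) \, P_M^{\alpha\beta}(0, m_-^{\beta}) , \tag{5.220} $$

$$ S'_3 = \sum_{m_+^{\alpha} = 1}^{\infty} P_M^{\alpha\beta}(m_+^{\alpha}, 0) \, P_M^{\alpha\beta}(0, 0) = \sum_{m_-^{\alpha} = 1}^{\infty} P_M^{\alpha\beta}(0, 0) \, P_M^{\alpha\beta}(m_-^{\alpha}, 0) , \tag{5.221} $$

$$ \mathcal{N} = S_0 + 2 S_1 + 2 S_2 + 2 S_3 + 2 S'_3 , \tag{5.222} $$

so that

$$ p^{\alpha\beta}_{00} = \frac{S_0}{\mathcal{N}} , \tag{5.223} $$
$$ p^{\alpha\beta}_{++} = p^{\alpha\beta}_{--} = \frac{S_1}{\mathcal{N}} , \tag{5.224} $$
$$ p^{\alpha\beta}_{+-} = p^{\alpha\beta}_{-+} = \frac{S_2}{\mathcal{N}} , \tag{5.225} $$
$$ p^{\alpha\beta}_{0+} = p^{\alpha\beta}_{0-} = \frac{S_3}{\mathcal{N}} , \tag{5.226} $$
$$ p^{\alpha\beta}_{+0} = p^{\alpha\beta}_{-0} = \frac{S'_3}{\mathcal{N}} . \tag{5.227} $$

All these sums are computed by inverting the order of the sums over $\ell_\pm$ and $m_\pm$ and adding the term corresponding to $m_\pm = 0$, for example

$$ \sum_{m_+^{\alpha} = 1}^{\infty} \sum_{m_+^{\beta} = 1}^{\infty} (\cdots) = \sum_{m_+^{\alpha} = 0}^{\infty} \sum_{m_+^{\beta} = 0}^{\infty} (\cdots) - \text{terms with } m_+ = 0 . $$

This gives:

$$ S_0 = \exp\left\{ -\alpha' k \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} \tag{5.228} $$

$$ S_1 = \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} \times \left\{ 1 - \exp\left( -\frac{\alpha' k}{2} q^{\alpha} \right) - \exp\left( -\frac{\alpha' k}{2} q^{\beta} \right) \right\} \tag{5.229} $$
$$ \phantom{S_1 =} + \exp\left\{ -\alpha' k \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} \tag{5.230} $$

$$ S_2 = \left\{ \exp\left( -\frac{\alpha' k}{2} q^{\alpha} \right) - \exp\left[ -\frac{\alpha' k}{2} \left( 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right) \right] \right\} \times \left\{ \exp\left( -\frac{\alpha' k}{2} q^{\beta} \right) - \exp\left[ -\frac{\alpha' k}{2} \left( 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right) \right] \right\} \tag{5.231} $$

$$ S_3 = \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) + q^{\alpha} \right] \right\} - \exp\left\{ -\alpha' k \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} \tag{5.232} $$

$$ S'_3 = \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) + q^{\beta} \right] \right\} - \exp\left\{ -\alpha' k \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} \tag{5.233} $$

$$ \mathcal{N} = 2 \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} \times \left\{ 1 - \exp\left( -\frac{\alpha' k}{2} q^{\alpha} \right) - \exp\left( -\frac{\alpha' k}{2} q^{\beta} \right) \right\} + 2 \exp\left\{ -\frac{\alpha' k}{2} (q^{\alpha} + q^{\beta}) \right\} + \exp\left\{ -\alpha' k \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} \tag{5.234} $$

The self-consistency equations (5.203) and (5.204) are then

$$ q^{\alpha} = \left[ \frac{1}{2} \left( 1 - \frac{S_0 + 2 S_3}{\mathcal{N}} \right) \right]^{k-1} \tag{5.235} $$

$$ q^{\beta} = \left[ \frac{1}{2} \left( 1 - \frac{S_0 + 2 S'_3}{\mathcal{N}} \right) \right]^{k-1} \tag{5.236} $$

Notice that these equations are coupled, as $S_0$, $S_3$ and $S'_3$ contain both $q^{\alpha}$ and $q^{\beta}$.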



Solution of the self-consistency equations 

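These equations have four fixed points, of which for $\alpha \to \infty$ only one is stable. To see it, I consider that as $\alpha \to \infty$, also $\alpha' \to \infty$ (I shall verify this later). Then, keeping only the leading exponential term in $\alpha'$,

$$ S_0 \ll S_3, S'_3 , \tag{5.237} $$

$$ S_3 \simeq \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) + q^{\alpha} \right] \right\} , \tag{5.238} $$

$$ S'_3 \simeq \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) + q^{\beta} \right] \right\} , \tag{5.239} $$

$$ \mathcal{N} \simeq 2 \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^{\alpha})(1 - q^{\beta}) \right] \right\} . \tag{5.240} $$

The self consistency equations then decouple:

$$ q^{\alpha} = \left[ \frac{1 - \exp\left( -\frac{\alpha' k}{2} q^{\alpha} \right)}{2} \right]^{k-1} \tag{5.241} $$

$$ q^{\beta} = \left[ \frac{1 - \exp\left( -\frac{\alpha' k}{2} q^{\beta} \right)}{2} \right]^{k-1} \tag{5.242} $$

These equations are identical. Each admits two solutions: one for $q \simeq 0$, and one for $q \simeq 1/2^{k-1}$ (of course, $q = 0$ is also a solution, but a trivial one). The solution close to 0 is

$$ q \simeq \left( \frac{4}{\alpha' k} \right)^{\frac{k-1}{k-2}} \tag{5.243} $$

and it is unstable, since the derivative of the right hand side is larger than 1. The other solution is

$$ q^* \simeq \frac{1}{2^{k-1}} \left[ 1 - (k-1) \exp\left( -\frac{\alpha' k}{2^k} \right) \right] \tag{5.244} $$

and this solution is stable. Therefore, for $\alpha \to \infty$ we shall have $q^{\alpha} = q^{\beta} = q^*$.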
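
The stability of the two non-trivial solutions can be illustrated by iterating the decoupled equation $q = \left[ (1 - e^{-\alpha' k q/2})/2 \right]^{k-1}$ from starting points on either side of the small solution. This is a sketch under the stated large-$\alpha'$ approximations; the function names and the values of `alpha_p` and `k` are hypothetical, and the estimate of the unstable point is only valid to leading order (and for $k \ge 3$).

```python
import math


def rhs(q, alpha_p, k):
    """Right-hand side of the decoupled self-consistency equation for q."""
    return ((1.0 - math.exp(-alpha_p * k * q / 2.0)) / 2.0) ** (k - 1)


def iterate(q, alpha_p, k, n_steps=500):
    """Apply the map q -> rhs(q) repeatedly and return the final value."""
    for _ in range(n_steps):
        q = rhs(q, alpha_p, k)
    return q


alpha_p, k = 50.0, 3
q_unstable = (4.0 / (alpha_p * k)) ** ((k - 1) / (k - 2))     # leading-order estimate
q_stable = 2.0 ** (-(k - 1)) * (1.0 - (k - 1) * math.exp(-alpha_p * k / 2**k))
print(q_unstable)
print(iterate(2e-3, alpha_p, k), q_stable)   # starts above the unstable point
print(iterate(1e-4, alpha_p, k))             # starts below it and flows to q = 0
```

Iterations started above the small solution flow to $q^*$, while those started below it flow to the trivial solution $q = 0$, consistent with the small solution being unstable.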

The computation of $\alpha'$ as a function of $\alpha$ is similar to the one I have shown in Section 5.3. We must impose that the average total number of occurrences of the new variable be

$$ \langle \ell_+ + \ell_- \rangle_{\rm Sat} = \alpha k . \tag{5.245} $$

The distribution of $(\ell_+, \ell_-)$ conditioned on the formula being satisfiable is obtained by summing over the values of $m_\pm$ that give no contradictions, i.e.

,0\l-) 



^ m",m^ — 1 

+■••+ E ^^r/(o,oi/+)pr/("^->oi?_)| 



1 1 



a'k 



A/- (/+)!(/_)! V 2 

X [(1 - q^y^+'- - (1 - qf'y+ - (1 - 

where the normalization factor $\mathcal{N}$ is the one from (5.234). We obtain



[(1 - - (1 - - 1 - 



(5.246) 



(5.247) 



Sat 



a'fc X — X e " ''^ X 

X I (1 - g")(l - qP) cxp [a'fc(l - <z")(l - qf")] + 
'a'k 



-(2-g"-/)exp 



1(2 -g"-/^ 



+ [l + (l-g")(l--z'')] exp 



'fc 



[l + (l-g")(l-g^ 



-(l-(7")(2-g'^)exp 
-(2-q")(l-g^)exp 



'fc 



(l-<?")(2-g^) 



^(2-0(l-/) 



(5.248) 



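For $\alpha \to \infty$ we shall have $q^{\alpha} = q^{\beta} = q^*$, and the leading order term in the numerator is the one containing $1 + (1 - q^*)^2$:

$$ \langle \ell_+ + \ell_- \rangle_{\rm Sat} \simeq \alpha' k \times \frac{1}{\mathcal{N}} \times \left[ 1 + (1 - q^*)^2 \right] \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^*)^2 \right] \right\} \tag{5.249} $$

with

$$ \mathcal{N} \simeq 2 \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^*)^2 \right] \right\} \tag{5.250} $$

so that

$$ \langle \ell_+ + \ell_- \rangle_{\rm Sat} = \frac{1}{2} \alpha' k \left[ 1 + (1 - q^*)^2 \right] + e^{-O(\alpha')} \tag{5.251} $$

and

$$ \alpha' = \frac{2 \alpha}{1 + (1 - q^*)^2} + e^{-O(\alpha)} . \tag{5.252} $$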
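
Since $q^*$ itself depends on $\alpha'$, the relation between $\alpha'$ and $\alpha$ is implicit; at large $\alpha$ it can be solved by a simple iteration. The sketch below assumes the leading-order forms $q^* \simeq 2^{-(k-1)} [1 - (k-1) e^{-\alpha' k/2^k}]$ and $\alpha' \simeq 2\alpha / [1 + (1 - q^*)^2]$, neglecting the exponentially small corrections; the function name and the numerical values are hypothetical.

```python
import math


def alpha_prime(alpha, k, n_steps=100):
    """Iterate alpha' = 2 alpha / (1 + (1 - q*)^2) with the large-alpha form of q*."""
    alpha_p = alpha                      # starting guess
    q_star = 2.0 ** (-(k - 1))
    for _ in range(n_steps):
        q_star = 2.0 ** (-(k - 1)) * (1.0 - (k - 1) * math.exp(-alpha_p * k / 2**k))
        alpha_p = 2.0 * alpha / (1.0 + (1.0 - q_star) ** 2)
    return alpha_p, q_star


alpha_p, q_star = alpha_prime(alpha=40.0, k=3)
print(alpha_p, q_star)
```

The result confirms that $\alpha'$ is of the same order as $\alpha$, as assumed when solving the self-consistency equations.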

Uniqueness of the state 

The joint probabilities are given, for large $\alpha$, by

$$ p^{\alpha\beta}_{00} = \frac{S_0}{\mathcal{N}} \simeq \frac{1}{2} \exp\left\{ -\frac{\alpha' k}{2} \left[ 1 - (1 - q^*)^2 \right] \right\} \tag{5.253} $$

$$ p^{\alpha\beta}_{++} = p^{\alpha\beta}_{--} = \frac{S_1}{\mathcal{N}} \simeq \frac{1}{2} - e^{-O(\alpha)} \tag{5.254} $$

$$ p^{\alpha\beta}_{0+} = p^{\alpha\beta}_{0-} = p^{\alpha\beta}_{+0} = p^{\alpha\beta}_{-0} = \frac{S_3}{\mathcal{N}} \simeq \frac{1}{2} \exp\left\{ -\frac{\alpha' k}{2} q^* \right\} \tag{5.255} $$

This confirms that the off-diagonal terms are exponentially suppressed, and that $p^{\alpha\beta}_{00} \ll p^{\alpha\beta}_{++}, p^{\alpha\beta}_{--}$. Apart from a fraction of variables of order $e^{-O(\alpha)}$ we see that the variables are constrained and must take the same value in the two states $\alpha$ and $\beta$, so that there is actually only one unique state.

The solution to the saddle point equations that we found in Section 5.2 is therefore unique.

5.6 Discussion of the results and conclusion 

In Paragraph 5.4.4 I have drawn the conclusion of this work: that the proof of convergence of WP provided in [75] for formulae extracted from the Planted Ensemble can be extended to formulae extracted from the Satisfiable Distribution. As we have seen, this contradicts a probabilistic version of Hypothesis 2. There are two questions that remain open and deserve attention.

The first regards Feige's complexity result. Theorem 1 was based on a deterministic form of Hypothesis 2, which is weaker than the probabilistic version refuted by the previous results. It would be very interesting to understand whether the hypotheses of Theorem 1 can be relaxed, and some conclusion reached on the basis of the refutation of Hypothesis 2p.

Even more interesting, from the physicist's point of view, is the second question. The above discussion for $k$-SAT can be easily extended to other models, such as $k$-XORSAT. The characterization of the solutions to large $\alpha$ satisfiable formulae in terms of the distribution of fields can be repeated, with similar results: that a fraction $1 - e^{-O(\alpha)}$ of the variables are constrained to take a unique value in all the solutions, and that the fields acting on the variables are of order $O(\alpha)$. However, there is a crucial distinction between $k$-SAT and $k$-XORSAT: the correlation between the sign of the field acting on a variable and a bias in the number of occurrences between it and its negation, which is present in $k$-SAT, cannot be present in $k$-XORSAT for obvious symmetry reasons. Since this is a crucial ingredient of the convergence of WP, it should not be expected to apply to $k$-XORSAT. It would then be very interesting to find an algorithm which identifies satisfiable $k$-XORSAT formulae at large $\alpha$, and to understand the implications this would have on Theorem 1.
List of notations



$\equiv$ Identical to

$\sim$ Asymptotically equal to, leading order in asymptotic expansions

$\simeq$ Approximately equal to

n m Integer division of n by m 

$P[\cdot]$ Probability

$E[\cdot]$ Expected value

$I[\text{event}]$ Indicator function of event, equal to 1 if event is true and 0 otherwise

$\vee$ Logical OR

$\wedge$ Logical AND

$\oplus$ Logical XOR

$|\mathcal{S}|$ Cardinality of set $\mathcal{S}$

$\langle \cdot \rangle$ Thermodynamic average

$\overline{O}$ Average over disorder of $O$

$i, j, k, \dots$ Site indices from 1 to $N$

$a, b, c, \dots$ Replica indices from 1 to $n$

CTj Individual spin 

a A^-componcnt spin configuration 

<T Replicated N x n spin configuration 

(t" A^-component spin configuration of replica a 

ai n-component spin configuration on site i 

a, f Generic n-component spin configurations 

cr" Value of spin on site i for replica a 

$\alpha$ Ratio between number of clauses $M$ and number of variables $N$ in a boolean constraint satisfaction problem

$\alpha_s$ Threshold value for SAT/UNSAT transition

$\alpha_c$ Threshold value for clustering transition

$\alpha_0$ Lower bound on $\alpha_s$ from the second moment inequality

$\alpha_h$ Largest value of $\alpha$ for which a poissonian DPLL heuristic succeeds with positive probability

$\Sigma_c$ Clustering transition surface

$\Sigma_s$ SAT/UNSAT transition surface

$\Sigma_k$ Critical surface (i.e. intersection of $\Sigma_c$ and $\Sigma_s$)

$\Sigma_q$ Contradiction surface

$F$ $k$-SAT formula

$\mathcal{P}_{\rm unif}[F]$ Uniform measure over random formulae

$\mathcal{P}_{\rm sat}[F]$ Uniform measure over satisfiable formulae

$\mathcal{P}_{\rm plant}[F]$ Planted measure over random formulae

$c(\vec{\sigma})$ Fraction of sites with replicated configuration $\vec{\sigma}$, functional order parameter

$R(h)$ Distribution of fields, functional order parameter equivalent to $c(\vec{\sigma})$

$\mathcal{F}$ Free energy density functional

$\nu$ "Thermodynamic potential", $\nu = \beta n$ as $\beta \to \infty$ and $n \to 0$

$e_0(\nu)$ Ground state energy density of formulae conditioned on $\nu$

$r_p$ Weight of $R(h)$ over $h = p \in \mathbb{Z}$

$I_p(x)$ Modified Bessel function of integer order

$\rho_p$ Limit of $r_p$ for $\nu \to \infty$